Data preparation, dashboards, knowledge... Data teams pay a high price for their technical debt.
Last week, we had a very interesting internal discussion with our developers and designer about how to deal with our design system and its iterations. Even as a young company, we understand that if we don't catch up quickly on the various iterations of our design system, our "Figma" and our codebase will start to diverge, leading to the famous but hated technical debt.
Technical debt lives up to its name. When it is not repaid, "interest" starts to accumulate and every further change carries an additional hidden cost. Your team's velocity slowly decreases and the team becomes rigid and stagnant.
This is very well known and described for software development. But what does it mean for your data team? How do you spot technical debt? How do you fight it? I focus here on its less technical forms.
The first form of data technical debt is most likely data that has not been prepared or transformed. Giving access to the data warehouse without any transformation or preparation of the most used data might have saved you time over the first few weeks, and you were not the bottleneck. But very quickly, you'll face data quality issues, with stakeholders requesting the same data in different ways.
How do you spot it?
You get the same question about a specific metric 10 times.
How do you fight it?
The Pareto principle applies here. Focus on the most used data points, make them super friendly to query, and scale from there.
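To find those most used data points, one option is to count which tables your stakeholders actually query. The sketch below is only an illustration: it assumes you can export a query log as a list of table names (one entry per query), and the function name, the 80% threshold and the sample data are all made up for the example.

```python
from collections import Counter

def top_tables_by_usage(query_log, coverage=0.8):
    """Return the most-queried tables that together cover `coverage` of all queries."""
    counts = Counter(query_log)
    total = sum(counts.values())
    selected, covered = [], 0
    # Walk tables from most to least queried until the target share is reached.
    for table, n in counts.most_common():
        if covered / total >= coverage:
            break
        selected.append(table)
        covered += n
    return selected

# Hypothetical log: one entry per query, naming the table it hit.
query_log = (
    ["orders"] * 50 + ["users"] * 30 + ["events"] * 12
    + ["refunds"] * 5 + ["coupons"] * 3
)
print(top_tables_by_usage(query_log))  # -> ['orders', 'users']
```

In this made-up log, two tables out of five account for 80% of the queries. Those are the ones to prepare first.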
I can't count the number of teams coming to us with the issue of "I have too many dashboards and I spend too much time maintaining them". Dashboard tools make it super easy to create new dashboards. They even make it easy to organize them in folders and stuff. The issue lies in the fact that data teams often create a dashboard to answer any type of question. Those dashboards then stay live for the next 2 years, leaving the team with 5 slightly different dashboards to maintain.
You can spot it when you get too many requests like "What is the difference between those 3 dashboards? Why don't they display the exact same values?" or "Hey! Dashboard X is broken".
To fight it, remember that not everything is meant to live in a dashboard. Ask yourself how often people will need to access this data point. It should be at the very least monthly. Ask yourself whether the metric you're looking at is about a single event in time (think a release or a marketing campaign) or can be followed over time. Delete dashboards that are not used enough. You'll always find someone complaining about it, but stay strong here and say no.
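These two questions can be turned into a quick review list. The sketch below is an assumption-laden illustration: the `Dashboard` fields, the 30-day idle limit and the sample dashboards are hypothetical, and you would need to fill them from whatever usage data your dashboard tool exposes.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Dashboard:
    name: str
    last_viewed: date
    one_off: bool  # True if it covers a single event (a release, a campaign)

def deletion_candidates(dashboards, today, max_idle_days=30):
    """List dashboards not viewed within `max_idle_days` or built for a one-off event."""
    cutoff = today - timedelta(days=max_idle_days)
    return [d.name for d in dashboards if d.one_off or d.last_viewed < cutoff]

# Hypothetical inventory, for illustration only.
dashboards = [
    Dashboard("Weekly revenue", date(2024, 6, 28), one_off=False),
    Dashboard("Spring campaign", date(2024, 6, 25), one_off=True),
    Dashboard("Legacy churn v2", date(2024, 3, 1), one_off=False),
]
print(deletion_candidates(dashboards, today=date(2024, 6, 30)))
# -> ['Spring campaign', 'Legacy churn v2']
```

The output is a list of candidates to discuss, not an automatic delete. A one-off dashboard that is still being viewed may simply need to become a one-time analysis instead.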
"I remember answering a pretty similar question in Slack. Let me find it for you"... you'll be dead before you can search through all your Slack content for the piece of information you're looking for. There are way too many messages exchanged in Slack every day.
You can spot it when you restart from scratch analyses you have already run in the past because you cannot find the previous results.
To fight it, document the questions you've been asked and the answers you provided. It is the smartest move. It can be in a simple spreadsheet or a Notion document (or directly in a dedicated tool).
It can be as easy as a 7-column sheet: question - takeaways - link to work - domains (#hashtags) - requester - analysts - updated at.
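To make this concrete, here is a minimal sketch of that 7-column sheet as a CSV, with a small keyword search on top. The column identifiers, the `search_log` helper and the sample row are hypothetical and only there to show the shape. A spreadsheet filter would do the same job.

```python
import csv
import io

# The 7 columns described above, as CSV headers (names are illustrative).
COLUMNS = ["question", "takeaways", "link_to_work", "domains",
           "requester", "analysts", "updated_at"]

def search_log(rows, keyword):
    """Return rows whose question or domains mention `keyword` (case-insensitive)."""
    kw = keyword.lower()
    return [r for r in rows
            if kw in r["question"].lower() or kw in r["domains"].lower()]

# Build a tiny in-memory sheet with one made-up entry.
sheet = io.StringIO()
writer = csv.DictWriter(sheet, fieldnames=COLUMNS)
writer.writeheader()
writer.writerow({
    "question": "Why did churn increase in March?",
    "takeaways": "Placeholder summary of the findings",
    "link_to_work": "https://example.com/analysis/1",
    "domains": "#churn #retention",
    "requester": "Head of Sales",
    "analysts": "Analyst A",
    "updated_at": "2024-04-02",
})

sheet.seek(0)
rows = list(csv.DictReader(sheet))
matches = search_log(rows, "churn")
print([m["link_to_work"] for m in matches])  # -> ['https://example.com/analysis/1']
```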
Here, I focused on the less "technical" part of technical debt. I could even have called it "process" debt. But be careful: technical debt also hides in your SQL practices, your continuous deployment pipelines, and so on.
I won't lie, we are convinced Husprey's notebooks allow data teams to drastically reduce their technical debt. Analysts are able to document on the go, as they work, and Husprey lets them search through the history of their work. Also, our notebooks' design, with Business Goals and Takeaways at the top, makes it super easy to read and reuse what was done! See for yourself and sign up to Husprey!