Finding the right organization for a data team is complex. Should it be centralized, decentralized, federated, or custom-made? The most common answer is “It depends”. (More great reading on the topic below.)
Are there specific ratios to keep in mind while growing? Mikkel Dengsøe wrote a great piece about this question. When focusing on the data analytics sub-teams, choices are even harder due to their highly specific positioning at the frontier of business and data.
As the data pipeline matures (what I mean here is “As the Modern Data Stack eats up the world”), analytics teams iterate on their structure and create new roles (hello analytics engineers), but still expect greater impact on business decisions. Very often, that means continuing to improve at:
While the first 2 goals are distinct, they are almost always handled in the same thread: within the same workflow, by the same people, and using the same tools. This slowly cripples your Analytics team’s velocity and discourages Business Users from asking (the right) questions.
To fix this slow infection, you’ll need to transform your team and processes so they can become multi-threaded: you tackle self-serve actions and advanced questions separately.
The good news: the infection doesn’t come unannounced.
On Slack, you start receiving more requests such as: “Sorry mate, I can’t remember where the retention dashboard is” or “Hey! I checked but didn’t find the product usage dashboard”. When Business Users start asking more of these, it basically means that there are too many dashboards. They are looking for data and have too much of it, yet no insights.
You will start feeling overloaded and stuck on “no value-added” tasks, while a sense of frustration grows among Business Users. Because answering small requests takes up a significant portion of your time, answers to more complex questions or deep-dives will be delayed. Over time, the number of advanced Business questions put to the Analytics team will start to go down.
But it’s never too late to take back control. You need to transform your team and processes so they can become multi-threaded and more predictable.
Ideally, you would enforce multi-threading by creating 2 distinct sub-teams. One would be responsible for providing the best self-serve experience. The other would be in charge of answering Business questions and building ad-hoc reports and deep-dives.
Designing a dashboard and writing an analysis are two very independent tasks. Some Data Analysts love to dig into the data, iterating over and over in order to find “the” insight. (Does it exist?) Many others would prefer to carefully design a dashboard that helps monitor operations and that an entire team will access on a daily basis. Those differences often match your team members’ personality traits, and splitting up the team will help them focus on what they like the most and be happier at work.
As in any good multi-threaded system, interference between sub-teams should be limited. If demand is higher on one side, the other team will continue to deliver as expected. This de-coupling will restore the trust of Business Users as they see clearer timelines.
What if I am solo in my team? What if we can’t split again after splitting by departments? What if we prefer to do both deep-dives and self-serve?
Your split should still happen, but it should be time-based. Tasks and topics would then shift on a regular basis. To benefit the most from those changes, make sure you don’t change topics too often. Based on feedback received from different teams, switching at an interval of less than a week is not recommended.
What ratio of self-serve analysts vs. deep-dive analysts should I have?
Well, once again, “it depends”. This ratio will evolve a lot depending on the quality of your Transform layer, the company’s data literacy, the size of your team… However, you already have a self-serve to ad-hoc ratio. To start your transition into this “multi-threaded” analytics, aim to keep the same ratio while working on creating a distinct context for each thread. Start from here and iterate.
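One way to find the ratio you already have is to count how recent requests split between the two threads. The snippet below is a minimal sketch, assuming you can export a log of past requests and tag each one by hand; the `requests` list, the `self_serve` / `ad_hoc` labels, and both function names are hypothetical and only there for illustration.

```python
from collections import Counter

# Hypothetical export of last quarter's requests, tagged by hand.
# The "kind" labels are an assumption, not a standard taxonomy.
requests = (
    [{"kind": "self_serve"} for _ in range(6)]
    + [{"kind": "ad_hoc"} for _ in range(4)]
)


def thread_ratio(requests):
    """Return the share of requests that fell into each thread."""
    counts = Counter(r["kind"] for r in requests)
    total = sum(counts.values())
    return {kind: count / total for kind, count in counts.items()}


def starting_split(requests, team_size):
    """Split a team between the two threads using today's request mix."""
    counts = Counter(r["kind"] for r in requests)
    total = sum(counts.values())
    self_serve = round(team_size * counts["self_serve"] / total)
    # Whoever is not on self-serve works on ad-hocs and deep-dives.
    return {"self_serve": self_serve, "ad_hoc": team_size - self_serve}


print(thread_ratio(requests))
print(starting_split(requests, team_size=5))
```

With 6 self-serve and 4 ad-hoc requests, a team of 5 would start with 3 people on self-serve and 2 on ad-hocs. Treat the result as a starting point to iterate on, not a target.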
Traditional BI solutions offer great self-serve options from Looker to Tableau to Metabase (and many more). They all have their key differentiators: better dashboards, better chart capabilities, better discovery over non-prepped data, ...
And yet their focus has mostly stayed on providing the best self-serve experience. Dashboards are really good at making sure Business is going as expected and at detecting failures or spikes. But they only display the data, which most of the time actually raises new questions. Why do I see what I see in the data? How can I change this trend? Answering those questions requires digging and thinking: this is where ad-hocs and deep-dives come in.
Let’s take an example. In the Acquisition dashboard, the Marketing team just saw their Customer Acquisition Cost (CAC for those in the know) take a leap. Warning. They try to hack some conclusions together in a spreadsheet but decide they need the help of their favorite Data Analyst. After some initial discussions to refine the underlying need, and some real effort, our Data Analyst is able to deliver her conclusions. She sends a Slack message with a link to a Notion/Confluence doc containing her findings, a bunch of copy-pasted charts and a link to a dashboard.
While it seems fine, several issues emerge here.
A new tailored dashboard was added to your already long list of dashboards. As dashboards stay live, they should always contain up-to-date data, and it won’t be long before someone taps you on the shoulder and ungraciously informs you that “The dashboard is broken”. You’ll have to fix it. Congrats, you found what developers call “Technical debt”. As you continue to run more ad-hocs, you’ll start hearing more requests like “CAC can be found in 3 dashboards. Which one should I look at?”. Ad-hoc analyses should create snapshot views, not live dashboards.
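To make the snapshot idea concrete, here is a minimal sketch of freezing the result of an ad-hoc query into a dated file instead of wiring it into a live dashboard. It assumes the query result is already available as a list of dictionaries; `save_snapshot`, its parameters, and the file naming scheme are hypothetical choices for illustration, not a feature of any BI tool.

```python
import csv
from datetime import date
from pathlib import Path


def save_snapshot(rows, name, directory, as_of=None):
    """Write query results to a dated CSV that is never refreshed."""
    if not rows:
        raise ValueError("nothing to snapshot: the result set is empty")
    as_of = as_of or date.today()
    path = Path(directory) / f"{name}_{as_of.isoformat()}.csv"
    if path.exists():
        # A snapshot is a frozen view: refuse to overwrite it silently.
        raise FileExistsError(f"snapshot already exists: {path}")
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Calling `save_snapshot(rows, "cac_deep_dive", "snapshots")` would produce a file such as `snapshots/cac_deep_dive_<today's date>.csv`. Because the file is dated and never updated, nobody can later report it as “broken”, and it does not add to the list of dashboards to maintain.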
But there is more.
Answering questions is an iterative process: a flow of assumptions and conclusions. You go back and forth from raw data to formatted data to new visualizations to insights. This loop should be as short as possible. Feedback from Business Users should happen during these iterations, well before you produce your final deliverable. Otherwise, you’ll slide into a copy/paste chaos that you’ll regret later when updating all the charts in your final report. The iterative process also has value in itself, and you should keep the flow as a reference for future use (either for you or for a newcomer), as it will improve your response time on later analyses.
Notebooks present themselves as the best way to share reports containing the flow, the conclusions and the underlying data.
Analytics teams already spend time on various tasks, but splitting up the team and tailoring the right processes and tools ensures a happy Analytics team and happy Business Users.
Designing dashboards and answering advanced Business questions call for different personality traits in your team members.
Dashboards are great for monitoring but create technical debt when used to serve ad-hoc purposes. Notebooks, on the other hand, mix the data and the brain juice of your Analytics team while speeding up the iterative flow.
De-coupling self-serve and ad-hoc work will bring magic to your team!