In an era where the cost of acquiring new customers is on the rise, retention has become the golden tool for businesses.
I'd like to delve into the definition of retention, explore the use of cohorts for comparing retention across diverse user groups, and finally, demonstrate how you can construct lightweight prediction models using your data warehouse (with an example from BigQuery).
This will empower you and your team to retain users and combat some leaky user buckets.
In simple terms, retention is the measure of how many users return to your product over time. It's the percentage of active users at any given time after X retention windows from their first retention event.
To define your internal retention, you need to select a retention window and a retention event.
The retention window depends on your ideal persona's usage frequency:
The retention event is the event you use to consider a user active or retained. This can be any active event, such as logging in, opening your app, buying a product, or using a specific feature.
While it needs to make sense for your business, you should aim for consistency to allow for benchmark comparisons. For example, you might refer to the B2B User Retention Product Benchmark or ChartMogul SaaS retention report.
To fully grasp the concept of retention, we must talk about cohorts.
While you can compute retention for your entire user base, impact of potential improvements will be difficult to discern. This is where cohorts come into play.
A cohort is a group of users (or companies) that share a common characteristic.
By default, retention is computed using a time-based cohort, such as users who signed up in 2023. If you're more familiar with marketing terminology, you can think of this as "an audience".
In fact, you should define different cohorts:
The idea is to compare retention across different cohorts and prioritize work (product or customer success) towards the cohorts where the leverage is greatest.
However, a word of caution: make sure to check your cohort sizes. You might not want to spend too much time improving the retention of your smallest cohort!
After assessing your retention levels and implementing new activation mechanisms to improve retention at the cohort level, it's time to combat churn at the account level.
However, it's easier to fight churn BEFORE someone actually churns rather than after. This is where predictions come into play.
Churn is the opposite of retention. It's when a user stops returning. Nowadays, it's easier than ever to create basic models to predict the probability of churn for an account or a user.
In BigQuery, for example, it's a three-step process:
To create a churn modeling table in BigQuery, you can consider the following dimensions (or features 🤓):
Once training is done, you can now predict the churn probability for each account and share that with your CSM team.
You can know join forces to retain customers by manually reactivating them or including them in some specifically designed campaigns. Example in Game Developers.
Retention, while often complex to grasp, can be a key driver in any business's growth when used appropriately. It's time to say goodbye to the "leaky bucket".
This basic modeling can be deployed as one-shot in about a week of work at companies of "normal" scale. If you want to run it in production every day, a bit more tooling, checks will be needed.
Next step will be to discuss Net Dollar Retention (NDR): how to increase revenue per account over time. This often requires changes to the pricing model, (which can be more challenging to implement than small Customer Success Management campaigns). Best companies achieve constant account growth.
If you need examples of retention sql queries, have questions about implementing churn prediction in BigQuery, or want to share your own experiences, I'd be more than happy to chat.
PS: if you want more BigQuery hacks you can also check BigFunctions amazing work from Paul Marcombes