Implementing reverse ETL to support our Marketing team

We ran into our first Modern Data Stack architecture question while supporting our Marketing team's acquisition process. Read on to see how we fixed it!

Thibaut Collette

February 14, 2023 · 3 min read

We recently ran into our first Modern Data Stack architecture question at Husprey! We're about to release a new onboarding email sequence (ETA end of the week!). Although this is something we have done in the past, and something all companies face very early on, we hit some common pitfalls that I wanted to share with you.

Onboarding emailing sequence

While our acquisition is getting better, we wanted to support it with a great onboarding email sequence. The goal is to activate new users and improve conversion from testing the product to continuous use. We identified three key actions to make this happen:

  • Maximize understanding of our value proposition through actual usage
  • Boost first-time users' satisfaction by helping them discover the Wow features
  • Engage new users by easing their feedback loop

Finally, our conviction is that for this to happen we need to personalize those emails for each user. Has someone connected a sample data source to try things out? Or a private internal one? Has someone invited colleagues? Emails can be personalized based on each user's attributes.

Separation of concerns and business teams autonomy

The shortest path could have been to ask the dev team to code something directly in our application that sends emails through an API. This solution would be easy to implement right now but most probably very costly to maintain, bringing back all the issues the Modern Data Stack is fighting against.

First on the list, the separation of concerns.

A Modern Data Stack often needs an initial upfront effort to Extract and Load data from various sources into a central data warehouse.

However, once the setup is done, it gives data teams (or advanced business users in smaller teams) the flexibility to keep user attributes and various aggregations up to date (the Transformation part). The warehouse becomes the single source of truth that can be synced to many downstream tools (CRMs, Segmentation, Emailing, ...). This is the reverse-ETL part.

Second, the autonomy of non-technical teams. Once data flows back to business tools, actions and decisions are easier. For example, a marketer can now change wording and run tests without bothering a developer. #iykyk

When content is embedded in the code, it is very often a pain to update

Data quality issues linked to customer experience

So the Modern Data Stack is a great place to start, even for a small project, to spur autonomy in the team and ensure a separation of concerns between dev, business (and data) teams.

So we did just that. We computed the right aggregates for our (awesome) users and now the data flows back to our emailing system.

But then, we faced a small architectural hiccup. Our ELT operations were running on one schedule, our reverse-ETL on a different schedule, and finally our emailing on a third.

That uncontrolled waterfall led to outdated data being pushed into our test emails: customer experience issues coming directly from a data quality issue.

Everything to keep me up at night.
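
To make the waterfall concrete, here is a minimal sketch of how three uncoordinated schedules can add up. The schedule values (every 6 hours, with different start offsets) are purely hypothetical and are not our actual configuration; the sketch also assumes each job finishes instantly, which real syncs do not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    period_h: int  # the job runs every `period_h` hours
    offset_h: int  # time of the first run, in hours after midnight

    def next_run_at_or_after(self, t: float) -> float:
        """Return the first run time (in hours) that is >= t."""
        if t <= self.offset_h:
            return float(self.offset_h)
        # Ceiling division: number of full periods needed to reach t.
        k = -(-(t - self.offset_h) // self.period_h)
        return self.offset_h + k * self.period_h


def email_delay_hours(event_t: float, elt: Schedule,
                      reverse_etl: Schedule, emailing: Schedule) -> float:
    """Hours between a user action and the first email that can reflect it."""
    loaded = elt.next_run_at_or_after(event_t)       # action lands in the warehouse
    synced = reverse_etl.next_run_at_or_after(loaded)  # attribute reaches the email tool
    sent = emailing.next_run_at_or_after(synced)     # email batch goes out
    return sent - event_t


# Hypothetical, uncoordinated offsets: each job waits almost a full period.
print(email_delay_hours(0.5, Schedule(6, 0), Schedule(6, 5), Schedule(6, 4)))  # 15.5

# Same periods, but each job starts one hour after the previous one.
print(email_delay_hours(0.5, Schedule(6, 0), Schedule(6, 1), Schedule(6, 2)))  # 7.5
```

With identical periods, simply ordering the offsets roughly halves the delay in this toy example. Any email sent before `sent` still uses the old attribute values, which is the kind of outdated data we saw in our test emails.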

A solution made of tradeoffs (as always?)

There are plenty of solutions that could specifically answer our problem. The question is how quickly we could come up with one while limiting maintenance costs and risks.

The solution:

1/ Force the first onboarding email to be sent right away by sending a limited set of user information from our backend application right at signup.

2/ Trigger email sends right after the ELT and reverse-ETL syncs.
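
As an illustration of step 2/, here is a minimal sketch of chaining the three stages so that each one only runs once the previous one has succeeded. The function names (`run_elt`, `run_reverse_etl`, `send_emails`) are hypothetical placeholders, not the API of any specific tool; in practice an orchestrator or the sync tools' own triggers could play this role.

```python
from typing import Callable, List, Sequence, Tuple

Step = Tuple[str, Callable[[], None]]


def run_chain(steps: Sequence[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that completed, so a failed sync
    never lets a downstream step (like emailing) run on stale data.
    """
    completed: List[str] = []
    for name, job in steps:
        try:
            job()
        except Exception as exc:
            print(f"{name} failed ({exc}); skipping downstream steps")
            break
        completed.append(name)
    return completed


# Hypothetical placeholders standing in for the real jobs.
def run_elt() -> None:
    print("ELT: sources -> warehouse")


def run_reverse_etl() -> None:
    print("reverse-ETL: warehouse -> emailing tool")


def send_emails() -> None:
    print("emailing: send onboarding batch")


if __name__ == "__main__":
    run_chain([
        ("elt", run_elt),
        ("reverse_etl", run_reverse_etl),
        ("emailing", send_emails),
    ])
```

The design choice here is to prefer a skipped email batch over an email built from outdated attributes.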

This solution is easy to implement but has 3 caveats:

  • It does not work smoothly for attributes that get updated with a high frequency (like number of executions). We decided to limit our usage to attributes that change less often, such as has_executed_a_query or has_used_local_tables.
  • It might not be absolutely robust to missing or longer-than-usual syncs. At our current stage, it's a risk we are willing to take. We will monitor syncs closely to see if any action is required.
  • Customer timezones might be a pain to deal with. There is always a pain to deal with, actually. We force multiple syncs a day, limiting the impact for timezones far away from our servers.
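
The first two caveats can be sketched in a few lines. This is only an illustration under assumptions: the input counters (`execution_count`, `local_table_count`) and the 6-hour freshness threshold are hypothetical, while has_executed_a_query and has_used_local_tables are the attributes mentioned above.

```python
from datetime import datetime, timedelta, timezone


def stable_attributes(execution_count: int, local_table_count: int) -> dict:
    """Turn fast-changing counters into flags that flip once and then stay put."""
    return {
        "has_executed_a_query": execution_count > 0,
        "has_used_local_tables": local_table_count > 0,
    }


def sync_is_fresh(last_sync_at: datetime, now: datetime,
                  max_age: timedelta) -> bool:
    """Return True when the last successful sync is recent enough to trust."""
    return now - last_sync_at <= max_age


# Usage with made-up values: a user with 12 executions and no local tables.
attrs = stable_attributes(execution_count=12, local_table_count=0)

now = datetime(2023, 2, 14, 12, 0, tzinfo=timezone.utc)
last_sync = now - timedelta(hours=2)

if sync_is_fresh(last_sync, now, max_age=timedelta(hours=6)):
    print("safe to send with", attrs)
else:
    print("sync is late, hold the batch")
```

A check like `sync_is_fresh` is one possible way to monitor missing or longer-than-usual syncs; using timezone-aware UTC timestamps keeps the comparison independent of where the servers are.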

And tada, a new onboarding sequence ready to be sent by the end of the week. Data collaboration at its finest.

If you aren't a Husprey user yet, feel free to create an account and receive this new onboarding sequence. We hope it will be concise but impactful.

Learn more about Husprey

Husprey is a powerful, yet simple, platform that provides tools for Data Analysts to create SQL notebooks effortlessly, collaborate with their team and share their analyses with anyone.