Some of you might already be aware that before founding Husprey, I worked in a small start-up where we developed a "chat with your data" bot for Slack.
That was back in 2018! Here is a link to a demo video of the product at that time — please be gentle with your critique of the design. In those days, I developed a strong conviction against text-to-sql.
Times have changed, technology has evolved. Is it now time for me to reconsider my stance on text-to-sql?
Back in 2018, language models primarily utilized deep learning architectures, such as Recurrent Neural Networks (RNNs) or Long Short-Term Memory (LSTM), and were specifically trained to recognize and classify "Named Entities" (e.g., Organizations, Countries, Persons, Dates) and understand the intent of sentences.
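To make the old paradigm concrete, here is a toy sketch of what such a pipeline produced. The keyword rules below stand in for trained RNN/LSTM classifiers, and all names are illustrative, not from any real system:

```python
def parse_question(question: str) -> dict:
    """Toy intent/entity parser in the 2018 style: classify the intent,
    tag known entities, and let downstream logic fill a query template."""
    q = question.lower()
    # Toy intent classifier: keyword rules stand in for a trained model.
    intent = "metric_query" if any(w in q for w in ("revenue", "sales", "count")) else "unknown"
    # Toy entity tagger: match against a closed list of known values.
    countries = ["norway", "france", "germany"]
    entities = {"Country": next((c.title() for c in countries if c in q), None)}
    return {"intent": intent, "entities": entities}

parsed = parse_question("What was our revenue in Norway last month?")
# parsed["intent"] == "metric_query", parsed["entities"]["Country"] == "Norway"
```

Everything outside the recognized intents and entity lists was invisible to such a system, which is exactly the rigidity LLMs later removed.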
This approach was employed not only for building our "chat with your data" bot, but also for designing voice assistants like Alexa and Siri.
Recent advancements in Large Language Models (LLMs) have resulted from the introduction of a new "Transformers" architecture. These models are pre-trained on massive amounts of text data, enabling them to deeply comprehend language patterns and representations.
Traditional Intent/Entity extraction has now become obsolete: LLMs can perform these tasks more effectively, while also unlocking entirely new use cases.
These additional use cases have shed new light on the possibilities of "chat with your data" and reopened the discussion on "text-to-sql" applications.
Many people dream of having a robot that can instantly answer any question they ask.
In fact, they often wish the analysts in their company could respond just as quickly. However, this isn't likely to happen anytime soon, for either of them.
A skilled analyst typically rephrases a question to provide a more accurate answer. Most of the time, the original question lacks context, and the underlying reasoning is hidden in someone's mind—the "why" (or sometimes the "whys," plural) behind the question.
When the context is either blurry or complex, machines will inevitably struggle. It's not a major issue, but it's important to remember that machines, as much as your analyst, require external context around the data to be truly helpful.
One might argue that since LLMs can generate code, they could simply generate sql and allow users to make suggestions and decisions themselves. However, writing code also requires context. Definitions for terms like "Active," "Completed," or "Churned" are not universal; there is no one-size-fits-all truth. As a result, answers must take these unique definitions into account.
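A minimal sketch of this point, with made-up company names and definitions: the same question should produce different SQL depending on how "Churned" is defined, so the definitions have to travel with the prompt.

```python
# Hypothetical metric definitions; no two companies share them exactly.
COMPANY_A = {"Churned": "customer with no order in the last 90 days"}
COMPANY_B = {"Churned": "customer whose subscription was cancelled and not renewed"}

def build_prompt(question: str, definitions: dict) -> str:
    """Prepend company-specific definitions so the model can translate
    business terms into the right SQL predicates."""
    context = "\n".join(f"- {term}: {meaning}" for term, meaning in definitions.items())
    return f"Definitions:\n{context}\n\nQuestion: {question}\nSQL:"

prompt = build_prompt("How many churned customers do we have?", COMPANY_A)
```

Without that context block, the model can only guess at a generic definition, and a plausible-looking query will quietly encode the wrong business logic.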
That being said, if a "domain" is well-defined and its data has been carefully modeled, AI can answer basic questions, or at least direct users to the places where they can find the necessary data.
Last but not least, AI is very good at explaining code. It may not know how you should write your sql from scratch, but given an example, it can help you figure out why something was done a specific way and understand what is happening, so you can reproduce it.
BI tools have undergone numerous interface iterations over the years. While some could still be enhanced, most of them allow users to access data faster than if they were to describe their request in plain English, which is the beauty of well-executed UX design.
For example, when given a chart of revenue by country, clicking on the visible "Norway" button to filter results will be quicker than typing "and filter by Norway" on your keyboard. While a chat interface is powerful for navigating to a notebook or dashboard that isn't frequently visited, it falls short compared to a well-designed dashboard interface with clickable buttons.
Another significant challenge surrounding data question answering or text-to-sql generation is the AI "hallucination."
AI doesn't know anything with absolute certainty; it derives results from probabilities and confidence levels.
In our "chat with your data" experience, we censored results to ensure we "never answered something wrong." However, this approach leaves end-users with a choice between spending 3 minutes to find the results with certainty or 30 seconds with no guarantee of finding the answer. Most of the time, users opt for the first choice.
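The censoring approach can be sketched as a simple confidence gate. The candidate answers and scores below are made up for illustration; in practice they might come from scoring several generated queries against the schema:

```python
def answer_or_refuse(candidates: list[tuple[str, float]], threshold: float = 0.9):
    """Surface an answer only when the best candidate's confidence clears
    the threshold; otherwise refuse rather than risk being wrong."""
    best_answer, best_score = max(candidates, key=lambda c: c[1])
    if best_score >= threshold:
        return best_answer
    return None  # refusal: the user falls back to finding the answer manually

answer_or_refuse([("42 customers", 0.95)])  # returns "42 customers"
answer_or_refuse([("17 customers", 0.60)])  # returns None (refuses)
```

Raising the threshold buys correctness at the cost of refusing more often, which is exactly the 3-minutes-with-certainty versus 30-seconds-with-no-guarantee trade-off users face.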
As mentioned before, trust (in the data and in the process) is the most valuable asset for your data team.
Trust relies on both communication and consistency. If a tool provides correct answers 90% of the time but fails the other 10%, then it's not a reliable BI tool, at least not for business users. You are then at risk of losing the trust you carefully built.
Here are two lists: one for the analytics use cases where AI can help, and one for those where it won't.
Keep in mind that the following suggestions have a 2-year expiration date. In other words, if anything in this list turns out to be incorrect, you can only mock me for the next 2 years.
Do I reconsider? Yes, I do. Generative AI is a booster for data teams.
However, it's crucial not to be misled and to focus on use cases where GenAI excels while avoiding generic pitfalls that could undermine the trust you've spent months building.
And if I have one conclusion, it is this: AI is a great assistant that boosts the capabilities of someone with knowledge, but it will fall short when answering questions from someone without that initial knowledge.