Using generative AI for data analysis

The latest generative AI models are revolutionizing data analysis. How can data teams take advantage of them to make better, more accurate decisions faster?

Thibaut Collette

March 14, 2023 · 3 min read

Let's talk about it.

Generative AI is an exciting area of research that is changing the way many daily tasks are done, and data analysis will change too. Analysts and business users will make better decisions faster, with improved data quality and accuracy.

While it's Thibaut writing, I must admit this was proofread by AI.

GenAI is excellent at basic English

GenAI is powerful when it comes to analyzing English-language content. Models are trained on very large collections of books and online articles, which lets them recognize how language is normally used and spot where a text departs from it. Proofreading and grammar fixes are therefore a perfect task for them, whereas we humans quickly reach our limits when reviewing large pieces of content.

I can no longer imagine publishing a blog post or a newsletter article without AI proofreading. It is not about the actual content, but about the final touches: typos and easy-to-fix English grammar issues.

Generative AI is a "few-shot learner"

This is probably the most striking feature of the latest models, GPT-3 and ChatGPT. Previous AI models required massive amounts of data to be trained for each specific use case. The latest models are pre-trained on more data with more compute power, which makes them generic enough to cover a broad range of tasks and easy to customize with a single "prompt" containing a few examples of the task. Many "no-coders" can write great prompts in plain English and get the AI to execute advanced tasks with little or no AI background.

Figure: Accuracy of the model's answers as a function of the number of examples provided in the context. Latest GPT-3 is in blue. Source: https://arxiv.org/pdf/2005.14165.pdf
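To make "a few examples in the context" concrete, here is a minimal sketch of how a few-shot prompt could be assembled in Python. The function name `build_few_shot_prompt`, the "Input:/Output:" layout and the column-name examples are all hypothetical choices made for illustration; they are not tied to any specific model or API, and the sketch stops before calling a model.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a plain-text few-shot prompt.

    The "Input:"/"Output:" layout is an illustrative convention,
    not a format required by any particular model.
    """
    parts = [instruction.strip(), ""]
    for source, target in examples:
        parts.append(f"Input: {source}")
        parts.append(f"Output: {target}")
        parts.append("")
    # Leave the last output empty so the model is asked to complete it.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical examples: raw column names and readable labels.
EXAMPLES = [
    ("cust_id", "Customer ID"),
    ("ord_ts", "Order timestamp"),
    ("tot_amt_usd", "Total amount (USD)"),
]

prompt = build_few_shot_prompt(
    "Rewrite each database column name as a readable label.",
    EXAMPLES,
    "shp_addr_zip",
)
print(prompt)
```

The resulting string is what would be sent to the model. Adding or removing pairs in `EXAMPLES` is the only "training" involved, which is why this kind of customization is accessible to people without an ML background.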

GenAI can now be used by many, and is no longer limited to the closed circle of ML engineers.

What does it mean for data analysis?

Data analysis often means scanning unstructured data, identifying patterns, and making predictions and recommendations. These are tasks where GenAI excels (again).

  • Identifying correlations,
  • Suggesting best practices,
  • Autocompleting your SQL,
  • Generating column names,
  • Generating charts,
  • Summarizing your writing,
  • You name it.

All of these tasks are within GenAI's scope. GenAI will help analysts iterate faster and amplify their impact.

Beware of limitations

While this is a wide-open field with great performance improvements still to come, GenAI won't replace analysts in the medium term.

Firstly, AI suggestions are based on existing, already written and documented solutions. Analysts' work will continue to thrive, and their impact will focus more and more on defining the right problems and finding creative solutions to them. They should also not forget the essence of their work: communication, collaboration, and education.

Secondly, there is trust. GenAI suggestions are based on probabilities, and the model is sometimes forced into an answer even when it is far from 100% confident. This can lead to incorrect results, and therefore to wrong conclusions and trust issues. As always, trust is mandatory for any data strategy to work, so AI will only help you achieve more if it preserves the trust that data teams have spent time building with their business users.
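The "forced into an answer" problem can be illustrated with a toy sketch. It assumes we have raw scores for a handful of candidate answers, which is a simplification: real models do not necessarily expose such scores, and their probabilities are not guaranteed to be well calibrated. The candidate labels, the scores and the 0.8 threshold below are hypothetical.

```python
import math
from typing import Dict, Optional


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def forced_answer(probs: Dict[str, float]) -> str:
    """Always return the most likely candidate, however unsure."""
    return max(probs, key=probs.get)


def answer_or_abstain(
    probs: Dict[str, float], threshold: float = 0.8
) -> Optional[str]:
    """Return the top candidate only if its probability reaches the threshold."""
    best = forced_answer(probs)
    return best if probs[best] >= threshold else None


# Hypothetical scores for "which metric does this column hold?"
uncertain = softmax({"revenue": 1.1, "gross_margin": 1.0, "net_income": 0.9})
confident = softmax({"revenue": 5.0, "gross_margin": 1.0, "net_income": 0.5})

print(forced_answer(uncertain))      # picks "revenue" at roughly 37%
print(answer_or_abstain(uncertain))  # None: not confident enough
print(answer_or_abstain(confident))  # "revenue" at roughly 97%
```

In the uncertain case the three candidates are almost equally likely, yet a system that always returns the top one still answers "revenue". Surfacing that uncertainty, or abstaining, is one possible way to protect the trust business users place in the result.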

What are the key things you expect the latest GenAI developments to bring to your day-to-day? For Husprey users, are there things you'd like to see in the product? 👀 If so, don't hesitate to send me an email with your suggestions!

Learn more about Husprey

Husprey is a powerful, yet simple, platform that provides tools for Data Analysts to create SQL notebooks effortlessly, collaborate with their team and share their analyses with anyone.