SQL queries

What are SQL queries?

The acronym SQL stands for “Structured Query Language”. It refers to a standardized computer programming language that is used for creating, altering and querying data in many relational database management systems. Basically, a query – as its name suggests – is a question that you ask or a request you send to a database.

Given that SQL is a universal language, it proves very useful and appropriate for easy communication and collaboration. It’s also quite flexible and versatile, which means that once you know the basics, you can do a lot of things with it. 

This makes SQL queries essential for any organization that strives to be data-driven: it empowers stakeholders to access and manipulate data while staying organized and following a clear-cut structure. Without SQL queries, extracting meaningful data from a relational database can be tricky, and data management processes tend to become quite complex and time-consuming. 

How does it work?

In a relational database, information is stored in tables, each one representing a specific entity type. In these tables, the rows are fields that indicate the instances of this type of entity, whereas the columns indicate the values attributed to that instance. 

In order to make the most of a relational database, you need to make clear and accurate SQL queries – which means that you need to know:

  • where the data comes from (which table or database);
  • which fields you want to select;
  • whether you want to set conditions for this selection: do you want all the data, or a specific number (in relation to a precise characteristic)? 

SQL query example

You want to know the number of organizations created in Husprey since January 1st 2022. For confidentiality reasons, all the data presented below has been changed.

Your database table is called product.organization, and is structured as such:

Extract of product.organization database

Your SQL query is written as such:

SQL query

Here is an explanation of each row of the query:

  • (1) Select: all Analytics SQL queries start with this word. It indicates that you’re going to select values.
  • (2,3) Determine which values you wish to select and extract
  • (2) Truncate the date at the beginning of each month. It’s a well-known method that is used to make sure you can link all the lines to a single month.
  • (3) Aggregate the result. Indicates that for every month, you count all the lines containing a distinct id value (id is a diminutive for “identify” in general). You later ask to rename the result column: organization_count.
  • (4,5) Indicate which database you want to extract the data from.
  • (6,7) Filter the data you want to select. Here, only the organizations created after 2020 will appear.
  • (8,9) Indicate how to aggregate the results. Specify that you want to make groups for each month. The monthly result will be called organization_count.
  • (10,11): Indicate that you want your result table to be sorted by month in ascending order.

After running the query, a result table will appear, as the count of New Organizations per month:

SQL results - Number of organization by months

Tips for optimizing your SQL queries

For your SQL queries to be useful and beneficial, you need to prioritize clarity: only then will stakeholders be able to understand the queries, and limit the number of errors.

  • delete, if possible, all unnecessary joins and subqueries;
  • create and use indexes;
  • limit the use of wildcard characters;
  • give your tables and columns clear and descriptive names; 
  • improve your queries’ readability by working on the layout: add indentations, line breaks and whitespaces.

You also need to make your SQL queries as exhaustive and explicit as possible, so as to encourage and facilitate collaboration: 

  • add comments to your queries and explain their purposes, sources, as well as the assumptions you made;
  • document your queries by indicating their purposes, inputs and outputs.

This will also enable stakeholders to understand and reuse these queries in the future!

Husprey Logo

Learn more about Husprey

Husprey is a powerful, yet simple, platform that provides tools for Data Analysts to create SQL notebooks effortlessly, collaborate with their team and share their analyses with anyone.