Introduction
SQL is the predominant language for database queries, and proficiency in SQL is important for accurate data querying. This requires a comprehensive understanding of the sequence in which SQL executes its clauses. Debugging your SQL script effectively and creating precise queries necessitates knowledge of how a database interprets and executes your SQL query.
In this article, we will discuss the specific sequence in which the clauses of an SQL query execute. However, if your query includes sub-queries or Common Table Expressions (CTE), bear in mind that these will always be executed first before any action takes place on the main query. Nonetheless, the execution order of clauses within a CTE or subquery remains unchanged.
We will be referencing the following two tables:
Customers
| customer_id | customer |
|---|---|
| 1 | Bruce Wayne |
| 2 | Clark Kent |
| 3 | Tony Stark |
| 4 | Bruce Banner |
| 5 | Peter Parker |
Purchases
| purchase_id | item | price | customer_id |
|---|---|---|---|
| 1 | Red Cape | 3.75 | 2 |
| 2 | Web Shooter | 9.26 | 5 |
| 3 | Batarang | 23.24 | 1 |
| 4 | Smoke Pellet | 2.99 | 1 |
| 5 | Red Boots | 17.41 | 2 |
| 6 | Sunglasses | 299.99 | 3 |
| 7 | Lab Coat | 74.23 | 4 |
Here is our SQL query, which identifies the two customers who have spent the most money, excluding purchases exceeding $200 and customers whose total purchases are less than $10:
SELECT
customers.customer_id,
customers.customer,
SUM(price) as
total_money_spent
FROM customers
INNER JOIN
purchases
on customers.customer_id = purchases.customer_id
WHERE
price < 200
GROUP BY
customers.customer_id,
customers.customer
HAVING
total_money_spent > 10
ORDER BY
total_money_spent desc
LIMIT
2
Here is the sequence of execution, breaking down what occurs at each stage:
FROM(including joins:INNER JOIN,LEFT JOIN,RIGHT JOIN,OUTER JOIN,CROSS JOIN, etc.)WHEREGROUP BYHAVINGSELECTORDER BYLIMIT
Step 1: FROM and JOINS
FROM customers
INNER JOIN
purchases
on customers.customer_id = purchases.customer_id
The complete customers table is invoked and combined with the purchases table based on the customer_id, resulting in a new primary table that includes the matches from both tables. Therefore, after our query execution, the database assembles this primary table:
| customer_id | customer | purchase_id | purchase | price |
|---|---|---|---|---|
| 1 | Bruce Wayne | 3 | Batarang | 23.24 |
| 1 | Bruce Wayne | 4 | Smoke Pellet | 2.99 |
| 2 | Clark Kent | 1 | Red Cape | 3.75 |
| 2 | Clark Kent | 5 | Red Boots | 17.41 |
| 3 | Tony Stark | 6 | Sunglasses | 299.99 |
| 4 | Bruce Banner | 7 | Lab Coat | 74.23 |
| 5 | Peter Parker | 2 | Web Shooter | 9.26 |
Step 2: The WHERE Clause
WHERE
price < 200
The WHERE clause serves as our filter, enabling us to omit undesired data from the main table and retain the data we wish to view. In this scenario, we are retaining all purchases below $200, thereby excluding the sunglasses purchase valued at $299.99.
Note: It's important to bear in mind that you can't utilize a WHERE clause on any columns that are undergoing aggregation (sum, avg, etc) in the statement. For this purpose, you'll need to use the HAVING clause, which we'll discuss later. If you are aggregating, the WHERE clause will omit rows BEFORE the aggregation commences. Therefore, in the context of our table, we will NOT be excluding total purchases exceeding $200.
| customer_id | customer | purchase_id | purchase | price |
|---|---|---|---|---|
| 1 | Bruce Wayne | 3 | Batarang | 23.24 |
| 1 | Bruce Wayne | 4 | Smoke Pellet | 2.99 |
| 2 | Clark Kent | 1 | Red Cape | 3.75 |
| 2 | Clark Kent | 5 | Red Boots | 17.41 |
| 4 | Bruce Banner | 7 | Lab Coat | 74.23 |
| 5 | Peter Parker | 2 | Web Shooter | 9.26 |
Step 3: Group Columns with GROUP BY
GROUP BY
customers.customer_id,
customers.customer
As there is an aggregation (SUM) in our query, GROUP BY will execute, aggregating the price by the two non-aggregated columns (customer_id and customer).
Note: When there is an aggregation in your statement, it's imperative to group by all non-aggregated columns that you are incorporating into your query. In this instance, since we are including both customer_id and customer in our SELECT clause, we must GROUP BY both of these columns. The sequence here is deliberate as it ensures that the query executes GROUP BY with their distinct ID first and then their name.
| customer_id | customer | price |
|---|---|---|
| 1 | Bruce Wayne | 26.23 |
| 21.16 | ||
| 4 | Bruce Banner | 74.23 |
Step 4: The HAVING Clause
HAVING
total_money_spent > 10
Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!
This clause enables us to filter by our aggregation as this occurs post-GROUP BY, when the data has been aggregated. It can't be used in place of a WHERE clause. In this scenario, we're excluding total purchases under $10 (poor Peter Parker).
| customer_id | customer | price | |
|---|---|---|---|
| 1 | Bruce Wayne | 26.23 | |
| 2 | Clark Kent | Peter Parker | 21.16 |
| 4 | Bruce Banner | 74.23 |
Step 5: SELECT Statement
SELECT
customers.customer_id,
customers.customer,
SUM(price) as
total_money_spent
This clause specifies the particular columns we wish to extract from the primary table we've assembled. In this instance, since we've used the GROUP BY clause and performed aggregation, we've already narrowed down to these columns. However, outside of aggregation, this step is pivotal to ensure that we're only extracting the desired data and assigning a suitable alias to each column, if required.
| customer_id | customer | total_money_spent |
|---|---|---|
| 1 | Bruce Wayne | 26.23 |
| 2 | Clark Kent | 21.16 |
| 4 | Bruce Banner | 74.23 |
Step 6: Utilizing ORDER BY
ORDER BY
total_money_spent DESC
This clause facilitates the sorting of the table. We can ORDER BY column ASC (Ascending, which is the default) or ORDER BY column DESC (descending). Additionally, you can ORDER BY multiple columns. In this instance, we are sorting by our aggregated column in descending order (note that ORDER BY is one of the few instances where the aggregated column's alias can be used because it occurs after the SELECT clause)
| customer_id | customer | total_money_spent |
|---|---|---|
| 4 | Bruce Banner | 74.23 |
| 1 | Bruce Wayne | 26.23 |
| 2 | Clark Kent | 21.16 |
Step 7: Setting LIMIT
LIMIT
2
This clause restricts the number of data rows we wish to retrieve. For instance, in our scenario, we are interested in only the top two customers. Therefore, setting a LIMIT of 2 confines our results to Bruce Banner and Bruce Wayne.
| customer_id | customer | total_money_spent |
|---|---|---|
| 4 | Bruce Banner | 74.23 |
| 1 | Bruce Wayne | 26.23 |
When you're merging multiple SQL queries vertically using Union or Union All, it's important to remember that both ORDER BY and LIMIT indicate the conclusion of the clause. Here's a brief example:
SELECT
customer_id,
customer
FROM customers
WHERE
customers.customer = 'Clark Kent'
UNION
SELECT
customer_id,
customer
FROM customers
WHERE
customer = 'Bruce Wayne'
ORDER BY
customer_id
LIMIT
2
While the above query won't result in any errors, the ORDER BY and LIMIT won't execute until after the queries have been merged into a single table, instead of just on the second query. Consider the following example as well:
SELECT
customer_id,
customer
FROM customers
WHERE
customers.customer = 'Clark Kent'
ORDER BY
customer_id
UNION
SELECT
customer_id,
customer
FROM customers
WHERE
customer = 'Bruce Wayne'
This example will generate an error because ORDER BY indicates the termination of the query. The same applies to LIMIT, so it is impossible to execute the second SQL query after UNION at the bottom. If you wish to include more than one ORDER BY and/or LIMIT in a UNION/UNION ALL statement, you must use parentheses to enclose the queries:
(SELECT
customer_id,
customer
FROM customers
WHERE
customers.customer = 'Clark Kent'
ORDER BY
customer_id)
UNION
(SELECT
customer_id,
customer
FROM customers
WHERE
customer = 'Bruce Wayne'
ORDER BY
customer_id)
Conclusion
Understanding the order in which SQL clauses execute can enhance your ability to write dynamic, accurate, and efficient queries that extract the precise data required for your projects. This understanding will also aid in troubleshooting your SQL queries when data retrieval is erroneous, by facilitating careful tracking of your steps in the sequence of execution until the problem is identified. Happy querying!


