I have a SQL query when joining several large tables. My database consists of millions of rows across the users, orders and transactions table. Here is the query
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN transactions t ON o.id = t.order_id
WHERE t.status = 'completed'
GROUP BY u.name
HAVING order_count > 10;
I ensured that there are indexes on all the necessary fields. users.id, orders.user_id and transactions.order_id. However the query becomes significantly slower as the dataset size increases.
The order table is approximately 5 million rows.
The transactions table is approximately 10 million rows.
The database is running on MySQL 8.
What optimization techniques or query restructuring could help improve performance in this case?
2
Answers
This delays the name lookup until after the
GROUP BY
This is ‘composite’ and ‘covering’ and useful for the transaction lookup. And the columns are in the optimal order:
If these suggestions do not help enough, please provide
SHOW CREATE TABLE
for both tables andEXPLAIN SELECT ...
.Try to create thoses indexes, they are covering…
Single columns indexes are good if the query returns few rows without any aggregation.