I have the following query:
SELECT
m.id_ AS match_id,
mv.id_ AS match_version_id,
t.id_ AS tournament_id
FROM
test.match_version AS mv
JOIN
test.match_ AS m ON m.id_ = mv.match_id
JOIN
test.tournament AS t ON t.orig_id = m.tournament_orig_id
AND t.tour_id = m.tour_id
ORDER BY mv.id_ DESC
LIMIT 100;
It is taking around 4 seconds to run where the match tables have circa 1m rows and the tournament table has circa 30k rows. If I remove the ORDER_BY
then it runs in 0.002 seconds.
Here are the CREATE TABLE
statements for each table:
CREATE TABLE `match_` (
`id_` int NOT NULL AUTO_INCREMENT,
`tour_id` tinyint NOT NULL,
`tournament_orig_id` int NOT NULL,
PRIMARY KEY (`id_`),
KEY `ix__match__tour_id` (`tour_id`),
KEY `ix__match__tour_id__tournament_orig_id` (`tour_id`,`tournament_orig_id`),
KEY `ix__match__tournament_orig_id` (`tournament_orig_id`),
CONSTRAINT `fk__match__tournament` FOREIGN KEY (`tour_id`, `tournament_orig_id`) REFERENCES `tournament` (`tour_id`, `orig_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1666470 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
CREATE TABLE `match_version` (
`id_` int NOT NULL AUTO_INCREMENT,
`match_id` int NOT NULL,
`generation_start` int NOT NULL DEFAULT '0',
`generation_end` int NOT NULL DEFAULT '0',
PRIMARY KEY (`id_`),
KEY `ix__match_version__match_id` (`match_id`),
KEY `ix__match_version__match_id__generation_end` (`match_id`,`generation_end`),
KEY `ix__match_version__generation_end` (`generation_end`),
CONSTRAINT `fk__match_version__match_id` FOREIGN KEY (`match_id`) REFERENCES `match_` (`id_`)
) ENGINE=InnoDB AUTO_INCREMENT=1669206 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
CREATE TABLE `tournament` (
`id_` int NOT NULL AUTO_INCREMENT,
`tour_id` tinyint NOT NULL,
`orig_id` int NOT NULL,
PRIMARY KEY (`id_`),
UNIQUE KEY `uq__tournament_oc__tour_id__orig_id` (`tour_id`,`orig_id`),
KEY `ix__tournament_oc__tour_id` (`tour_id`),
KEY `ix__tournament_oc__orig_id` (`orig_id`)
) ENGINE=InnoDB AUTO_INCREMENT=30962 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Here is the EXPLAIN
result:
+----+-------------+-------+------------+-------+-------------------------------------------------------------------------------------------------+----------------------------------------+---------+-------------------------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-------------------------------------------------------------------------------------------------+----------------------------------------+---------+-------------------------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | t | NULL | index | uq__tournament_oc__tour_id__orig_id,ix__tournament_oc__tour_id,ix__tournament_oc__orig_id | uq__tournament_oc__tour_id__orig_id | 5 | NULL | 30900 | 100 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | m | NULL | ref | PRIMARY,ix__match__tour_id,ix__match__tour_id__tournament_orig_id,ix__match__tournament_orig_id | ix__match__tour_id__tournament_orig_id | 5 | test.t.tour_id,test.t.orig_id | 48 | 100 | Using index |
| 1 | SIMPLE | mv | NULL | ref | ix__match_version__match_id,ix__match_version__match_id__generation_end | ix__match_version__match_id | 4 | test.m.id_ | 1 | 100 | Using index |
+----+-------------+-------+------------+-------+-------------------------------------------------------------------------------------------------+----------------------------------------+---------+-------------------------------+-------+----------+----------------------------------------------+
How come I’m getting considerably poorer performance when the ORDER_BY
is present? How would I solve this?
Note: Worth mentioning that I know it’s not ideal that there is no tournament_version.tournament_id
foreign key that links to tournament.id_
but that’s sadly what I’m working with atm!
3
Answers
Your query appears to be running into issues with the mysql order by limit slow rate look up. Unfortunately, this is a missing optimization in mysql.
See additional details here to understand why this query is slow. At the bottom of the page, it shows some strategies for how to speed up queries like this: https://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/
If that doesn’t work you can try to restructure your query to not include both ORDER BY and LIMIT. As you probably noticed from toying with your query, if you can remove either of those keywords, it will be fast again.
Your indexes are very close. I would have the following indexes
You want the index to have the key components that qualify the join conditions, but it would also help to include the "ID" to make it a "covering" index. This basically means that all the data you have in the indexes can resolve all the parts of the query. The only thing missing in your indexes are the IDs.
A covering index has the extra _id columns, so the engine never needs to go back to each individual page of data for the records. It has it in the index and never needs to go there. What you have, it has the index components for the joining, but once you throw out that you want the respective IDs, it now requires you to go back to the raw data to get that piece.
Since you are also applying an ORDER BY, it has to return ALL the records that qualify, thus looking at every record before finding the top 100 you are trying to limit.
You have no date context to further limit a given tournament, or range of, so yes, you are literally processing EVERY tournament and match. At a minimum, I would start by the indexes I am suggesting.
I recommend these:
A
FOREIGN KEY
is two things:INDEX
— equally well)