I have this query I’m running in MySQL:
SELECT
count(*)
FROM
library AS l
JOIN plays AS p ON p.user_id = l.user_id AND
l.path = p.path
WHERE
l.user_id = 20977 AND
p.time >= '2022-10-17';
When EXPLAIN ANALYZE is run:
| -> Aggregate: count(0) (cost=1085653.55 rows=6692) (actual time=12576.265..12576.266 rows=1 loops=1)
-> Nested loop inner join (cost=1084984.37 rows=6692) (actual time=40.604..12566.569 rows=56757 loops=1)
-> Index lookup on l using user_id_2 (user_id=20977) (cost=116747.95 rows=106784) (actual time=13.153..3783.204 rows=59631 loops=1)
-> Filter: ((p.user_id = 20977) and (p.`time` >= TIMESTAMP'2022-10-17 00:00:00')) (cost=8.24 rows=0) (actual time=0.135..0.147 rows=1 loops=59631)
-> Index lookup on p using path (path=l.`path`) (cost=8.24 rows=8) (actual time=0.090..0.146 rows=1 loops=59631)
|
1 row in set (12.76 sec)
I obviously want to make this faster!
Table definitions
CREATE TABLE `library` (
`user_id` int NOT NULL,
`name` varchar(20) COLLATE utf8mb4_general_ci NOT NULL,
`path` varchar(512) COLLATE utf8mb4_general_ci NOT NULL,
`title` varchar(512) COLLATE utf8mb4_general_ci NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`edited` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`db_id` int NOT NULL,
`tag` varchar(64) COLLATE utf8mb4_general_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
CREATE TABLE `plays` (
`user_id` int DEFAULT NULL,
`name` varchar(20) CHARACTER SET utf8 DEFAULT NULL,
`path` varchar(512) COLLATE utf8mb4_general_ci DEFAULT NULL,
`time` datetime DEFAULT CURRENT_TIMESTAMP,
`play_id` int NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
ALTER TABLE `library`
ADD PRIMARY KEY (`db_id`),
ADD KEY `user_id_loc` (`user_id`,`name`,`path`(191)),
ADD KEY `edited` (`edited`),
ADD KEY `created` (`created`),
ADD KEY `title` (`title`),
ADD KEY `user_id` (`user_id`),
ADD INDEX `user_id_by_title` (`user_id`, `title`);
ALTER TABLE `plays`
ADD PRIMARY KEY (`play_id`),
ADD KEY `user_id` (`user_id`,`name`,`path`(255)),
ADD KEY `user_id_2` (`user_id`,`name`),
ADD KEY `time` (`time`),
ADD KEY `path` (`path`),
ADD KEY `user_id_3` (`user_id`,`name`,`path`,`time`);
It looks like the killer is the looping over 59631 rows.
Would an index on (user_id, time)
make it faster?
Interestingly, the user_id_2
index is actually an index on (user_id, title)
, rather than the plain user_id
index. I’m not sure why user_id_2
is chosen given title
isn’t used in the query.
2
Answers
I tested your query and tried a different index in each table.
The note "Using index" in each row of the EXPLAIN report shows that it’s getting the benefit of a covering index for both tables.
I didn’t use prefix index syntax, because that would spoil the covering index optimization. It’s not necessary to use prefix indexes for this example on modern MySQL versions because they default to an InnoDB row format that supports 3072-byte indexes instead of the old MySQL that only supported 768-byte indexes by default.
In my test, I had zero rows in the tables I tested, so I had to use an index hint to make the optimizer choose my new indexes. In a table with a substantial number of rows, the optimizer might choose the new indexes on its own.
DROP these, they in the way and/or redundant:
Add these:
Change (MySQL 5.7/8.0 no longer needs the prefix kludge):
Try to avoid testing columns from different table in
WHERE
.First I saw
and assumed that was the crux of the problem. But then I saw that you did not have
INDEX(user_id, time)
onp
and that the tables are joined [partially] onuser_id
.Suggest (to avoid my confusion) that you make this change:
The Optimizer should be smart enough to realize that, then use
But, once you have done that, the query collapses to
I think that it will say "Covering index skip scan for deduplication" to indicate that it is not actually scanning all 60K rows in
plays
, but hopping through the index!However, if there are "plays" that do not have a corresponding entry in "library", then the count will be high by the number of missing user-play combos.