My table structure:
```sql
CREATE TABLE `jobs_view_stats` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `job_id` int(11) NOT NULL,
  `created_at` datetime NOT NULL,
  `account_id` int(11) DEFAULT NULL,
  `country` varchar(2) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `IDX_D05BC6799FDS15210` (`job_id`),
  KEY `FK_YTGBC67994591257` (`account_id`),
  KEY `jobs_view_stats_created_at_id_index` (`created_at`,`id`),
  CONSTRAINT `FK_YTGBC67994591257` FOREIGN KEY (`account_id`) REFERENCES `accounts` (`id`) ON DELETE SET NULL,
  CONSTRAINT `job_views_jobs_id_fk` FOREIGN KEY (`job_id`) REFERENCES `jobs` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=79976587 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='New jobs views system';
```
This is the query:
```sql
SELECT COUNT(id) AS view, job_id
FROM jobs_view_stats
WHERE jobs_view_stats.created_at BETWEEN '2022-11-01 00:00:00' AND '2022-11-30 23:59:59'
GROUP BY jobs_view_stats.job_id;
```
Execution plan:
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|----|-------------|-------|------------|------|---------------|-----|---------|-----|------|----------|-------|
| 1 | SIMPLE | jobs_view_stats | null | range | IDX_D05BC6799FDS15210, jobs_view_stats_created_at_id_index | jobs_view_stats_created_at_id_index | 5 | null | 1584610 | 100 | Using index condition; Using MRR; Using temporary; Using filesort |
This query takes 4 minutes to complete; I want to reduce that to the minimum possible time.
2 Answers
In your execution plan, the range scan returns 1,584,610 rows that are then grouped, which in turn uses a temporary table to sort through and group them (slow).

`jobs_view_stats_created_at_id_index` also contains `id`, which makes the key cardinality excessively large; you could instead try adding `job_id` to that key, since that is what you are grouping by.

I think the main issue is that your WHERE clause matches over 1.5 million rows, which all have to be loaded into a temp table and then re-read in full to be grouped.

I think you need to bite off smaller chunks.
I'm going to assume you are using a programming language (like PHP) to invoke the DB calls. If so, you could first fetch the list of job_ids, then run smaller queries looping over the results of the first, batching the job_ids if there are many distinct ones; see the sketch below.
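Something along these lines (an untested sketch; the IN-list values are placeholders to be filled in from the first result set):

```sql
-- Step 1: fetch the distinct job_ids seen in the date range
SELECT DISTINCT job_id
FROM jobs_view_stats
WHERE created_at BETWEEN '2022-11-01 00:00:00' AND '2022-11-30 23:59:59';

-- Step 2: count views for one batch of job_ids at a time
SELECT COUNT(id) AS view, job_id
FROM jobs_view_stats
WHERE created_at BETWEEN '2022-11-01 00:00:00' AND '2022-11-30 23:59:59'
  AND job_id IN (101, 102, 103)  -- placeholder batch taken from step 1
GROUP BY job_id;
```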
For a pure MySQL resolution, I would create a temporary MEMORY table of all the job_ids, then join against it; see the sketch below.
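A sketch of that approach (the temp-table name `tmp_job_ids` is made up):

```sql
-- Collect the distinct job_ids for the range into a MEMORY temp table
CREATE TEMPORARY TABLE tmp_job_ids ENGINE=MEMORY AS
SELECT DISTINCT job_id
FROM jobs_view_stats
WHERE created_at BETWEEN '2022-11-01 00:00:00' AND '2022-11-30 23:59:59';

-- Count views per job_id by joining back through the temp table
SELECT COUNT(s.id) AS view, t.job_id
FROM tmp_job_ids AS t
JOIN jobs_view_stats AS s ON s.job_id = t.job_id
WHERE s.created_at BETWEEN '2022-11-01 00:00:00' AND '2022-11-30 23:59:59'
GROUP BY t.job_id;
```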
This is all totally untested, so there might be typos.
Change `COUNT(id)` to `COUNT(*)`, since it then does not need to check `id` for being NOT NULL.
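With that change the query reads:

```sql
SELECT COUNT(*) AS view, job_id
FROM jobs_view_stats
WHERE created_at BETWEEN '2022-11-01 00:00:00' AND '2022-11-30 23:59:59'
GROUP BY job_id;
```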
Add indexes that contain both `created_at` and `job_id`, so the query can be resolved entirely from an index.
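Presumably something like the following (the index names here are illustrative):

```sql
ALTER TABLE jobs_view_stats
  ADD INDEX idx_created_job (created_at, job_id),  -- illustrative name
  ADD INDEX idx_job_created (job_id, created_at);  -- illustrative name
```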
Either of those will be "covering". (I don't know which is better; the Optimizer will know.)
Drop the index on `(job_id)`, as it becomes redundant (the new `(job_id, created_at)` index starts with the same column). Unless you specifically need `(created_at, id)`, drop that one too.
For even more performance, build and maintain a Summary Table, then perform the query against it. Such a summary table might have `PRIMARY KEY (job_id, created_date)` and daily counts for each day for each job; a sketch follows.
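A minimal sketch, assuming a nightly job rolls up the previous day's rows (the table name `jobs_view_daily` and its columns are made up):

```sql
-- Hypothetical summary table: one row per job per day
CREATE TABLE jobs_view_daily (
  job_id       INT NOT NULL,
  created_date DATE NOT NULL,
  views        INT UNSIGNED NOT NULL,
  PRIMARY KEY (job_id, created_date)
);

-- Roll up one day's raw rows (re-runnable thanks to the upsert)
INSERT INTO jobs_view_daily (job_id, created_date, views)
SELECT job_id, DATE(created_at), COUNT(*)
FROM jobs_view_stats
WHERE created_at >= '2022-11-01' AND created_at < '2022-11-02'
GROUP BY job_id, DATE(created_at)
ON DUPLICATE KEY UPDATE views = VALUES(views);

-- The monthly report then scans at most ~30 small rows per job instead of millions
SELECT SUM(views) AS view, job_id
FROM jobs_view_daily
WHERE created_date BETWEEN '2022-11-01' AND '2022-11-30'
GROUP BY job_id;
```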