
I’m having trouble understanding my options for optimizing this specific query. Looking online, I find various resources, but they all cover queries that don’t match my particular one. From what I can gather, it is hard to optimize a query that combines an ORDER BY with a LIMIT.

My use case is a paginated datatable that displays the latest records first.

The query in question is the following (fetching the 10 latest records):

select
    `xyz`.*
from
    xyz
where
    `xyz`.`fk_campaign_id` = 95870
    and `xyz`.`voided` = 0
order by
    `registration_id` desc
limit 10 offset 0

The table DDL:

CREATE TABLE `xyz` (
  `registration_id` int NOT NULL AUTO_INCREMENT,
  `fk_campaign_id` int DEFAULT NULL,
  `fk_customer_id` int DEFAULT NULL,
  ... other fields ...
  `voided` tinyint unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`registration_id`),
  .... ~12 other indexes ...
  KEY `activityOverview` (`fk_campaign_id`,`voided`,`registration_id` DESC)
) ENGINE=InnoDB AUTO_INCREMENT=280614594 DEFAULT CHARSET=utf8 COLLATE=utf8_danish_ci;

The explain on the query mentioned gives me the following:

"id","select_type","table","partitions","type","possible_keys","key","key_len","ref","rows","filtered","Extra"
1,SIMPLE,db_campaign_registration,,index,"getTop5,winners,findByPage,foreignKeyExistingCheck,limitReachedIp,byCampaign,emailExistingCheck,getAll,getAllDated,activityOverview",PRIMARY,"4",,1626,0.65,Using where; Backward index scan

As you can see, it says it only hits 1626 rows. But when I execute it, it takes 200+ seconds to run.

I’m doing this to fetch data for a datatable that displays the latest 10 records. I also have pagination that lets one navigate pages (only to the next page; there is no way to jump to the last page or make any big jumps).

To further help with getting the full picture I’ve put together a dbfiddle: https://dbfiddle.uk/Jc_K68rj. This fiddle does not show the same results as my table, but I suspect that is because of the data size I have in my table.

The table in question has 120 GB of data and 39,000,000 active records. I already have an index in place that should cover the query and allow it to fetch the data fast. Am I completely missing something here?

2 Answers


  1. KEY `activityOverview` (`fk_campaign_id`,`voided`,`registration_id` DESC)
    

    is optimal. (Nearly as good is the same index, but without the DESC).

    Let’s see the other indexes. I strongly suspect that there is at least one index that is a prefix of that index. Remove it/them. The Optimizer sometimes gets confused and picks the "smaller" index instead of the "better" index.
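    One way to spot such prefix duplicates is to list every index with its columns in index order. This is a sketch of a helper query against information_schema (the table name is taken from the question; adjust as needed):

    ```sql
    -- List each index on xyz with its columns in index order.
    -- Any index whose column list is a leading prefix of
    -- activityOverview's (fk_campaign_id, voided, registration_id)
    -- is a candidate for removal.
    SELECT index_name,
           GROUP_CONCAT(column_name ORDER BY seq_in_index) AS columns_in_order
    FROM information_schema.statistics
    WHERE table_schema = DATABASE()
      AND table_name = 'xyz'
    GROUP BY index_name;
    ```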

    Here’s a technique for seeing whether it manages to read only 10 rows instead of most of the table: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#handler_counts
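    The handler-count technique from that link can be sketched like this (assumes you can run the query interactively in a MySQL session):

    ```sql
    -- Reset the session's Handler counters.
    FLUSH STATUS;

    -- Run the query under test.
    SELECT `xyz`.*
    FROM xyz
    WHERE `xyz`.`fk_campaign_id` = 95870
      AND `xyz`.`voided` = 0
    ORDER BY `registration_id` DESC
    LIMIT 10 OFFSET 0;

    -- Handler_read_* totals close to 10 mean the index delivered the
    -- rows directly; totals in the millions mean a large scan happened.
    SHOW SESSION STATUS LIKE 'Handler%';
    ```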

  2. Another solution goes something like this:

    SELECT b.*
        FROM ( SELECT registration_id 
                 FROM xyz
                where  `xyz`.`fk_campaign_id` = 95870
                  and  `xyz`.`voided` = 0
                order by  `registration_id` desc
                limit  10 offset 0 ) AS a
        JOIN xyz AS b USING (registration_id)
        order by  `registration_id` desc;
    

    Explanation:

    • The derived table (subquery) will use the ‘best’ index without any extra prompting, since that index is "covering".
    • That will deliver 10 ids.
    • Then 10 JOINs to the table fetch xyz.*.
    • A derived table is unordered, so the ORDER BY needs repeating.

    That’s tricking the Optimizer into doing what it should have done anyway.

    (Again, I encourage getting rid of any indexes that are prefixes of the 3-column, optimal, index discussed.)
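    Since the pagination only ever moves to the next page, a keyset ("seek") variant would avoid OFFSET entirely, at the cost of remembering the last id shown. A sketch, where the literal id is a placeholder for whatever the previous page ended on:

    ```sql
    -- Page 1: the original query with LIMIT 10 and no OFFSET.
    -- Page 2 and onward: filter by the last registration_id already
    -- shown instead of using OFFSET; 280600000 is a placeholder value.
    SELECT `xyz`.*
    FROM xyz
    WHERE `xyz`.`fk_campaign_id` = 95870
      AND `xyz`.`voided` = 0
      AND `xyz`.`registration_id` < 280600000
    ORDER BY `registration_id` DESC
    LIMIT 10;
    ```

    With the activityOverview index in place, this reads only the 10 rows for the requested page no matter how deep the pagination goes.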
