skip to Main Content

I am using using MySQL (InnoDB engine). I am running SQL query with WHERE IN statement which is also ordered.

I decided to create the perfect index, so my query will not run slow. Surprisingly, after creating the perfect index MySQL is still using filesort when running my query.

My Query:

SELECT *
    FROM `events`
    WHERE
        `events`.`user_id` IN (118, 105, 126, 196, 338)
    ORDER BY
        `date` ASC,
        `planing_hour_id` ASC;

My perfect Index:

ALTER TABLE `events`
    ADD INDEX `my_perfect_index` (
        `user_id`,
        `date` ASC,
        `planing_hour_id` ASC
    );

Output of EXPLAIN:
EXPLAIN SQL Query Screenshot

Problem:

MySQL still uses filesort even when there is perfect index available. I would expect only Using Index condition to be present in output of EXPLAIN query. Why is this not the case?

Disclaimer:

I checked similar topics and none of them has been useful:

2

Answers


  1. You can eliminate "Using filesort" if MySQL’s natural order of reading the rows matches the order you request in your ORDER BY. Thus there is no sorting needed, because they are already read in the right order.

    MySQL always reads in index order. Whether it uses a secondary index like yours, or the primary key (also called the clustered index), rows will be read in the order they are stored in that index. So if that order does not match your ORDER BY, then a filesort is required.

    (Note "filesort" doesn’t necessarily mean it will be on disk. If there are few rows in the result, the sorting can be done in RAM relatively quickly.)

    So why doesn’t your index count as a "perfect" index?

    Think of this analogy:

    SELECT ... FROM TelephoneBook 
    WHERE last_name IN ('Addams', 'Kirk', 'Smith') 
    ORDER BY first_name;
    

    The result:

    last_name first_name
    Addams Morticia
    Kirk James
    Smith Sarah Jane

    These are read in index order by last_name. But they are not implicitly in order by first_name, the order requested in the query. The query needs a phase to filesort the set of matching rows.

    If you were to make the "perfect" index this way:

    ALTER TABLE `TelephoneBook`
        ADD INDEX `my_perfect_index` (
            `first_name`,
            `last_name`
        );
    

    The query might use this index, and eliminate the filesort, because it’s reading the rows in order by first_name. However, this can’t also do the search to find rows matching last_name most efficiently.

    The best it can do is index condition pushdown, which is to tell the storage engine not to return rows that don’t match the index — but the storage engine does have to examine all the indexes to evaluate them. This is not as good for optimization as reading from an index leading with last_name.

    Ultimately, there is no perfect index in cases where the query must search with a range condition (IN() qualifies as a range condition), and also do a sort that doesn’t match the column of the range condition.

    Login or Signup to reply.
  2. If your concern is high performance you can try rewriting your query as:

    select * from events where user_id = 118
    union all select * from events where user_id = 105
    union all select * from events where user_id = 116
    union all select * from events where user_id = 196
    union all select * from events where user_id = 338
    order by date, planning_hour_id
    

    This form effectivelly seeks the index five times (albeit hitting the secondary and primary index each time), then unions the result sets, and finally sort them.

    In this case a simple index is useful, since the sorting won’t use the index and will require materialization. Nothing bad with it unless the query ends up selecting thousands of rows.

    create index i1 on events (user_id);
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search