skip to Main Content

I am trying to use covering index (with a functional index) in MySQL 8.0.28 to speed up a group by query on a large InnoDB table.

Covering index created:

ALTER TABLE `sp_files`
ADD INDEX `ix_site_header_to_cleaned_text2` (`sp_site_foreign_key`, (length(`cleaned_text`)));

Using only the fields from the covering index I would expect that MySQL uses the index instead of the underlying table – so I would expect the Extra column of the output of the EXPLAIN to show Using index:

EXPLAIN
SELECT sp_site_foreign_key,
    count(case when length(`cleaned_text`) > 100 then 1 else NULL END) as num1,
    count(1) as num2
FROM sp_files USE INDEX (`ix_site_header_to_cleaned_text2`)
GROUP by sp_site_foreign_key; 
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE sp_files index ix_site_header_to_cleaned_text2 ix_site_header_to_cleaned_text2 208 105249 100

Any idea why it is not using the index?
All fields of the query are contained in the ix_site_header_to_cleaned_text2 index, so I would expect the explain to show Using index in the Extra column.

2

Answers


  1. I’m not a MySQL professional, but what if you surround ix_site_header_to_cleaned_text2 with the “ inside FROM sp_files USE INDEX (ix_site_header_to_cleaned_text2)?

    Login or Signup to reply.
  2. It is using the index (hence the type: index join type shown in the EXPLAIN report), but it can’t gain the covering index optimization.

    MySQL’s support for functional indexes has some limitations that you might not notice unless you read the manual very carefully:

    https://dev.mysql.com/doc/refman/8.0/en/generated-column-index-optimizations.html

    The optimizer can use indexes on generated columns to generate execution plans, even in the absence of direct references in queries to those columns by name. This occurs if the WHERE, ORDER BY, or GROUP BY clause refers to an expression that matches the definition of some indexed generated column.

    This means it’s supposed to allow you to use the same expression as the functional part of the index (in this case length(cleaned_text)), and automatically get that value from the index, so it doesn’t have to recalculate it.

    But this doesn’t work if you use that expression in other clauses of the query, such as the select-list in your case, or in the JOIN clause. It doesn’t implement this optimization in other clauses than those named in the manual description.

    Perhaps they will improve support for functional indexes further in future versions of MySQL.

    In the meantime, you may get the covering index only if you define a virtual column explicitly, and reference that column in the index and in your query.

    Demo, tested on MySQL 8.0.36:

    CREATE TABLE sp_files (
      id SERIAL PRIMARY KEY,
      sp_site_foreign_key INT,
      cleaned_text TEXT,
      other INT,
      length_cleaned_text INT AS (length(cleaned_text)) 
    );
    
    ALTER TABLE `sp_files`
    ADD INDEX `ix_site_header_to_cleaned_text2` (`sp_site_foreign_key`, length_cleaned_text);
    
    EXPLAIN
    SELECT sp_site_foreign_key,
        count(case when length_cleaned_text > 100 then 1 else NULL END) as num1,
        count(1) as num2
    FROM sp_files USE INDEX (`ix_site_header_to_cleaned_text2`)
    GROUP by sp_site_foreign_keyG
    

    Output:

    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: sp_files
       partitions: NULL
             type: index
    possible_keys: ix_site_header_to_cleaned_text2
              key: ix_site_header_to_cleaned_text2
          key_len: 10
              ref: NULL
             rows: 1
         filtered: 100.00
            Extra: Using index
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search