skip to Main Content

I have a table with several million rows of data. I have a primary key on id, and a compound unique key on col2, col3, col4 and my_date (called comp_indx). Sample data is shown here…

id   col2 col3 col4 my_date             col5 col6 col7
1    1    1    1    2020-01-03 02:00:00 a    1    a
2    1    2    1    2020-01-03 01:00:00 b    2    1
3    1    3    1    2020-01-03 03:00:00 c    3    b
4    2    1    1    2020-02-03 01:00:00 d    4    2
5    2    2    1    2020-02-03 02:00:00 e    5    c
6    2    3    1    2020-02-03 03:00:00 f    6    3
7    3    1    1    2020-03-03 03:00:00 g    7    d
8    3    2    1    2020-03-03 02:00:00 h    8    4
9    3    3    1    2020-03-03 01:00:00 i    9    e

If I perform the following query…

SELECT col2, col3, max(my_date)
FROM table
where col4=1 and my_date <= '2001-01-27'
group by col2, col3

…the query is very efficient, and running the explain command shows…

select_type type  key       key_len rows Extra
----------- ----- --------- ------- ---- -------------------------------------
SIMPLE      range comp_indx 11      669  Using where; Using index for group-by

However, if I run a similar command (only requesting more columns – none of which are part of an index), e.g…

SELECT col2, col3, max(my_date), col5, col7
FROM table
where col4=1 and my_date <= '2001-01-27'
group by col2, col3

…then the performance drops right off, and if I run the explain command again, I get…

select_type type  key       key_len rows     Extra
----------- ----- --------- ------- -------  -----------
SIMPLE      index comp_indx 11      5004953  Using where

I can see that the type has changed from range to index, and I can see that the index is no longer being used for the group-by.

I am trying to understand why this is happening, and more importantly, how can I fix this issue?

BTW the table definition is…

CREATE TABLE `my_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `col2` smallint(6) NOT NULL,
  `col3` smallint(6) NOT NULL,
  `col4` smallint(6) NOT NULL,
  `my_date` datetime NOT NULL,
  `col5` char(1) NOT NULL,
  `col6` char(1) NOT NULL,
  `col7` char(1) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `comp_indx` (`col2`,`col3`,`col4`,`my_date`)
) ENGINE=InnoDB;

5

Answers


  1. Chosen as BEST ANSWER

    I have now fixed my performance issue by using 2 selects on the same table and a join, e.g...

    SELECT *
    FROM (
        SELECT col2, col3, max(my_date) as max_date
        FROM table
        where col4=1 and my_date <= '2001-01-27'
        group by col2, col3
    ) aaa
    join
    (
        SELECT col2, col3, my_date, col5, col6, col7
        FROM table
        where col4=1
    ) bbb
    on (aaa.col2=bbb.col2 and aaa.col3=bbb.col3 and aaa.max_date=bbb.my_date);
    

  2. You probably need to add this covering index to make the second query faster :

    create index comp2_index on my_table(col2, col3, col4, my_date, col5, col7);
    
    Login or Signup to reply.
  3. Add the following index

    alter table my_table add key cl4_dt_cl2_cl3 (col4,my_date,col2,col3);
    

    Moreover , the following query is invalid if sql_mode only_full_group_by is enabled

    SELECT col2, col3, max(my_date), col5, col7
    FROM table
    where col4=1 and my_date <= '2001-01-27'
    group by col2, col3
    
    Login or Signup to reply.
  4. Your original query:

    SELECT col2, col3, max(my_date), col5, col7
    FROM table
    where col4=1 and my_date <= '2001-01-27'
    group by col2, col3
    

    col5, col7 should be added to the group by clause as well right?

    Login or Signup to reply.
  5. If you don’t need id for anything, then this this will speed up the query, regardless of the extra columns (col5/6/7) that you need to fetch.

    CREATE TABLE `my_table` (
      `col2` smallint(6) NOT NULL,
      `col3` smallint(6) NOT NULL,
      `col4` smallint(6) NOT NULL,
      `my_date` datetime NOT NULL,
      `col5` char(1) NOT NULL,
      `col6` char(1) NOT NULL,
      `col7` char(1) NOT NULL,
      PRIMARY KEY (col4,my_date,col2,col3)
    ) ENGINE=InnoDB;
    

    If you do need id because of it being referenced from other table(s), then add

      `id` int(11) NOT NULL AUTO_INCREMENT,
      INDEX(id)  -- This is sufficient to keep auto_inc happy
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search