skip to Main Content

Mysql 8 takes 2 minutes to sort an empty set

The following table on 8.0.33 MySQL Community Server

CREATE TABLE `eventiAcc` (
  `chiave` int NOT NULL AUTO_INCREMENT,
  `tipo` tinyint NOT NULL,
  `account` int DEFAULT NULL,
  `utente` int DEFAULT NULL,
  `timestp` datetime NOT NULL,
  `dato` varchar(50) DEFAULT NULL,
  `extra` varchar(200) DEFAULT NULL,
  PRIMARY KEY (`chiave`),
  KEY `idx1` (`account`,`tipo`,`timestp`),
  KEY `idx2` (`tipo`,`utente`),
  KEY `idx_timestp` (`timestp`)
) ENGINE=InnoDB AUTO_INCREMENT=40908800 DEFAULT CHARSET=utf8mb3

The table contains 20 million records.

if I run this query

 SELECT chiave, tipo, account, utente, timestp, dato, extra 
FROM eventiAcc 
WHERE utente=25169 
AND tipo=26 
AND dato='aggIndFatt';

the resulti is
Empty set (0.06 sec)
because the where clause is optimized on index KEY idx2 (tipo,utente)

but if I run

 SELECT chiave, tipo, account, utente, timestp, dato, extra 
FROM eventiAcc 
WHERE utente=25169 
AND tipo=26 
AND dato='aggIndFatt' 
ORDER BY timestp DESC LIMIT 1;

the resulti is
Empty set (2 min 8.12 sec)

I suspect that mysql server sorts the entire table before applying the where clause trying to optimize the SORT on index KEY idx_timestp (timestp) before applying where clause.

Other tests that seems to support my assumption:

SELECT chiave, tipo, account, utente, timestp, dato, extra FROM eventiAcc WHERE utente=25169 AND tipo=26;
+----------+------+---------+--------+---------------------+--------------+-----------------------------------------------------------------------------------------------+
| chiave   | tipo | account | utente | timestp             | dato         | extra                                                                                         |
+----------+------+---------+--------+---------------------+--------------+-----------------------------------------------------------------------------------------------+
| 12703973 |   26 |    4445 |  25169 | 2020-08-14 20:58:42 | aggCaratt    | Generali     

….

| 40908354 |   26 |    4445 |  25169 | 2024-01-29 11:39:10 | selectUser   | oldUser=null                                                                                  |
+----------+------+---------+--------+---------------------+--------------+-----------------------------------------------------------------------------------------------+
4826 rows in set (0.08 sec)

select chiave, tipo, account, utente, timestp, dato, extra FROM eventiAcc WHERE utente=25169 AND tipo=26 ORDER BY timestp DESC LIMIT 1 ;
+----------+------+---------+--------+---------------------+------------+--------------+
| chiave   | tipo | account | utente | timestp             | dato       | extra        |
+----------+------+---------+--------+---------------------+------------+--------------+
| 40908354 |   26 |    4445 |  25169 | 2024-01-29 11:39:10 | selectUser | oldUser=null |
+----------+------+---------+--------+---------------------+------------+--------------+
1 row in set (0.01 sec)

The strangest behavior is that if the I drop "KEY idx_timestp (timestp)" index and I lunch the same query that takes 2 minutes

 SELECT chiave, tipo, account, utente, timestp, dato, extra 
FROM eventiAcc 
WHERE utente=25169 
AND tipo=26 
AND dato='aggIndFatt' 
ORDER BY timestp DESC LIMIT 1 

he result is

Empty set (0.08 sec)

Dropping the KEY idx_timestp (timestp) index solves the problem but the index is needed in many different queroes on that table

It is any way to suggest Mysql 8 to ignore the index on timestp field?
Which is the best approach to solve this performance issue?

2

Answers


  1. I don’t think that "sorting" the "empty set" is the problem, but rather the time it takes to discover that the set is empty.

    A secondary index works in one of two ways:

    • "Using index" (see output from EXPLAIN SELECT) — This means the index was "covering". Discovering that there are zero rows is done entirely in the index’s BTree makes the query fast.

    • Not covering. Your first, slow, query could use only one column (tipo); it had to reach over into the data BTree to check the other two columns. This back-and-forth was quite costly in your case.

    The solution is to add another index with the 3 columns tested in WHERE clause

    Your second query would benefit from

    INDEX(utente, tipo, data,    -- in any order (since all tests are "=")
          timestp)               -- last
    

    This will be "covering" for both queries and make the sort go away. And it will probably avoid any need for "index hints"

    MySQL will almost never use two indexes in a single SELECT.

    Login or Signup to reply.
  2. MySql allows to ignore a specific index by adding IGNORE INDEX (index_name) after the table name.

    So in your case the query should become:

    SELECT chiave, tipo, account, utente, timestp, dato, extra 
    FROM eventiAcc 
    IGNORE INDEX (idx_timestp) 
    WHERE utente=25169 
    AND tipo=26 
    AND dato='aggIndFatt' 
    ORDER BY timestp DESC LIMIT 1 
    

    Just for reference, the following keywords also exist: USE INDEX and FORCE INDEX that allow respectively to suggest an index to the optimizer and force the optimizer to use a specific index.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search