skip to Main Content

I have a big table (~30M rows), it usually run fast( 5-6ms per request). Sometimes one request take a lot of time (something like 60sec).

Here the table structure :

CREATE TABLE table (
  id int(11) NOT NULL,
  A int(11) NOT NULL,
  B varchar(32) NOT NULL,
  C tinyint(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

ALTER TABLE `table`
  ADD PRIMARY KEY (id),
  ADD KEY A (A),
  ADD KEY B (B);

Cardinality is 2M for index A and 1k for index B.

Here my request :

SELECT * FROM table
WHERE A = someAvalue
    AND B = 'some B value'
    AND C = 0
ORDER BY id DESC
LIMIT 1;

And the result of explain :

id|select_type|table|type|possible_keys|key|key_len|ref  |rows|Extra 
1 |SIMPLE     |table|ref |A,B          |B  |34     |const|1   |Using index condition; Using where; Using filesort 

key_len and rows are not relevent on this case because it’s from an exemple query

If i remove AND C = 0 request run a normal amount of time. It seems that this request go wild when index A and B return no value.

So here my question : why adding a non indexed field in this case can make the request run in 60+ sec instead of some millisecond ?

2

Answers


  1. Why my select with indexed field and non-indexed field sometimes take
    long time?

    The first important issue is the usage of MyISAM storage Engine, please switch to InnoDB. MyISAM supports table lock only even for select statements.

    See What are the main differences between InnoDB and MyISAM?

    Secondly add the following index on your table.

    alter table your_table add index c_a_b(C,A,B)
    
    Login or Signup to reply.
  2. There are a few potential reasons that could explain this behavior:

    • Loss of Index Efficiency: Your query is initially using indexes on columns A and B. However, when you add the non-indexed condition C = 0, the database can no longer efficiently filter rows using just the indexes. It has to check the C column for each row that matches the conditions on A and B. This process can be time-consuming, particularly if the number of rows matching A and B is large.

    • Full Table Scan or Row-by-Row Filtering: Since C is not indexed, the database might perform a full table scan or a row-by-row check after filtering with A and B, significantly increasing the query time, especially for a large table like yours.

    • Query Execution Plan Changes: The inclusion of a non-indexed field can alter the execution plan generated by the database’s query optimizer. The optimizer might choose a less efficient path due to the added complexity of filtering on a non-indexed field.

    • Filesort for Ordering: The EXPLAIN output indicates the use of "Using filesort," which is an operation that can be slow, especially if it’s working on a large intermediate result set. The sorting operation becomes more complex with the additional non-indexed field.

    • Inefficient Use of Resources: When you add an extra condition, the query might require more memory and CPU time to process the additional filtering and sorting, which could slow down the execution in resource-constrained environments.

    • Data Skew: If the distribution of values in C is skewed (i.e., very few rows have C = 0), the database might end up scanning a significant portion of the table to find the relevant rows, leading to longer execution times.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search