skip to Main Content

I have a "simple" query how takes 0.7678 seconds or more to be executed with MariaDB.

Here is the query:

select  `referenceNumber`
    from  `invoice`
    where  `groupId` = 3550
      and  `referenceNumber` >= 301
    order by  `referenceNumber` desc
    limit  1;

These columns have an index: "referenceNumber", "groupId"

Here is the result of an EXPLAIN:
enter image description here

I found a solution by creating a subquery like that:

select  `referenceNumber`
    from  (
        SELECT  id
            from  `invoice`
            where  `groupId` = 3550
              and  `referenceNumber` >= 301
          ) as subquery
    JOIN  invoice as invoice  ON invoice.id = subquery.id
    order by  `referenceNumber` desc
    limit  1;

This query takes like 0.0011 seconds.

Here is the result of an EXPLAIN:
enter image description here

Do you have an explanation about the poor performance of the first query?

Two surprising findings:

The query without the where `groupId` = 3550 takes only 0.0005 seconds like that:

select  `referenceNumber`
    from  `invoice`
    where  `referenceNumber` >= 301
    order by  `referenceNumber` desc
    limit  1;

The query without the order by `referenceNumber` desc takes only 0.0011 seconds like that:

select  `referenceNumber`
    from  `invoice`
    where  `groupId` = 3550
      and  `referenceNumber` >= 301
    limit  1;

Here is the schema of this table:

CREATE TABLE `invoice` (
  `id` int(10) UNSIGNED NOT NULL,
  `groupId` int(11) NOT NULL,
  `referenceNumber` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
            COLLATE=utf8mb4_unicode_ci;

ALTER TABLE `invoice`
  ADD PRIMARY KEY (`id`),
  ADD KEY `invoice_groupid_index` (`groupId`),
  ADD KEY `invoice_referencenumber_index` (`referenceNumber`);

ALTER TABLE `invoice`
  MODIFY `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT;
COMMIT;

Thank you really much for your help!

2

Answers


  1. When it comes to indexes, having an index on column A and column B will not help if there’s a query involving both of them.

    Adding an index on A creates a look-up table to records with various A values, and can provide support for ORDER, BETWEEN and other operations involving ordered values. Importantly it does not account for the order of anything B related.

    Likewise, an index on B does much the same thing, ignoring the order of A.

    In general, if you want to query WHERE A=? ORDER BY B then you need an index on A,B. This creates an index with data sorted on A, then sub-sorted (for equal values of A) on B. This makes comparisons very quick, they can often happen entirely within the index.

    Login or Signup to reply.
    • The subqueries are overkill. They may involve building a temp table, which adds cost. (There are other situations where this is actually optimal, but not here.)

    • The LIMIT without an ORDER BY gives you a random item; don’t do it that way. This is, in my opinion, not valid, so it does not matter how fast it is.

    • For performance, the rest need you to replace KEY(groupId) with

        INDEX(groupId, referenceNumber)  -- in this order
      
    • Some of your formulations lead the optimizer into a quandary of whether to use KEY(groupId) or KEY(referenceNumber). Each has its pros and cons. But the Optimizer does not have adequate information to choose between them. The composite index above will home in on the single row that is desired.

    • Note: My suggested index is both "composite" and "covering". See those terms for further reference.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search