I have a "simple" query how takes 0.7678 seconds or more to be executed with MariaDB.
Here is the query:
select `referenceNumber`
from `invoice`
where `groupId` = 3550
and `referenceNumber` >= 301
order by `referenceNumber` desc
limit 1;
These columns have an index: "referenceNumber", "groupId"
Here is the result of an EXPLAIN
:
I found a solution by creating a subquery like that:
select `referenceNumber`
from (
SELECT id
from `invoice`
where `groupId` = 3550
and `referenceNumber` >= 301
) as subquery
JOIN invoice as invoice ON invoice.id = subquery.id
order by `referenceNumber` desc
limit 1;
This query takes like 0.0011 seconds.
Here is the result of an EXPLAIN:
Do you have an explanation about the poor performance of the first query?
Two surprising findings:
The query without the where `groupId` = 3550
takes only 0.0005 seconds like that:
select `referenceNumber`
from `invoice`
where `referenceNumber` >= 301
order by `referenceNumber` desc
limit 1;
The query without the order by `referenceNumber` desc
takes only 0.0011 seconds like that:
select `referenceNumber`
from `invoice`
where `groupId` = 3550
and `referenceNumber` >= 301
limit 1;
Here is the schema of this table:
CREATE TABLE `invoice` (
`id` int(10) UNSIGNED NOT NULL,
`groupId` int(11) NOT NULL,
`referenceNumber` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
COLLATE=utf8mb4_unicode_ci;
ALTER TABLE `invoice`
ADD PRIMARY KEY (`id`),
ADD KEY `invoice_groupid_index` (`groupId`),
ADD KEY `invoice_referencenumber_index` (`referenceNumber`);
ALTER TABLE `invoice`
MODIFY `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT;
COMMIT;
Thank you really much for your help!
2
Answers
When it comes to indexes, having an index on column A and column B will not help if there’s a query involving both of them.
Adding an index on A creates a look-up table to records with various A values, and can provide support for
ORDER
,BETWEEN
and other operations involving ordered values. Importantly it does not account for the order of anything B related.Likewise, an index on B does much the same thing, ignoring the order of A.
In general, if you want to query
WHERE A=? ORDER BY B
then you need an index onA,B
. This creates an index with data sorted on A, then sub-sorted (for equal values of A) on B. This makes comparisons very quick, they can often happen entirely within the index.The subqueries are overkill. They may involve building a temp table, which adds cost. (There are other situations where this is actually optimal, but not here.)
The
LIMIT
without anORDER BY
gives you a random item; don’t do it that way. This is, in my opinion, not valid, so it does not matter how fast it is.For performance, the rest need you to replace
KEY(groupId)
withSome of your formulations lead the optimizer into a quandary of whether to use
KEY(groupId)
orKEY(referenceNumber)
. Each has its pros and cons. But the Optimizer does not have adequate information to choose between them. The composite index above will home in on the single row that is desired.Note: My suggested index is both "composite" and "covering". See those terms for further reference.