Edit:
EXPLAIN result for the first two queries when using src_cus_lang:
EXPLAIN result for the last two queries when using cust_lang_src:
Original post:
I’m using MySQL and have a table as follows:
CREATE TABLE tm(
id INT AUTO_INCREMENT PRIMARY KEY,
source TEXT(7000) NOT NULL,
target TEXT(6000) NOT NULL,
language CHAR(3),
customer VARCHAR(10))
When using composite index (source(755), customer, language)
, the fetch times are as follows:
SELECT source FROM tm WHERE customer = 'Customer A' AND language = 'ENU' AND source LIKE '%.net%'; -- 0.015s
SELECT source FROM tm WHERE source LIKE '%.net%' AND customer = 'Customer A' AND language = 'ENU'; -- 0.015s
However, when using composite index (customer, language, source(755))
, the fetch times increased drastically:
SELECT source FROM tm WHERE customer = 'Customer A' AND language = 'ENU' AND source LIKE '%.net%'; -- 1.7s
SELECT source FROM tm WHERE source LIKE '%.net%' AND customer = 'Customer A' AND language = 'ENU'; -- 1.7s
This is contrary to what I found somewhere on the Internet, which claims that the leftmost prefix rule is applied until <
, >
, BETWEEN
or LIKE
is met. According to this claim, the (customer, language, source(755))
index would be faster than (source(755), customer, language)
index regarding fetch time, since source
is used with LIKE
. I can’t post the source, because it’s not in English. Is this claim wrong? If not, why did I see contrary results?
2
Answers
Your condition
source LIKE '%.net%'
cannot use an index. The wildcard at the start of your pattern makes the index useless, because the index entries are sorted from the start of the string.By analogy, think of a telephone book. It’s sorted by names. Can you find entries with "net" in the middle of the name efficiently? No, because the sort order of the book doesn’t help at all.
Since that condition cannot use an index, the index on
(source(755), customer, language)
cannot be used. It can’t use the first column, so it can’t use any subsequent columns.The index on
(customer, language, source(755))
can be used, but only for the first two columns. TheLIKE
condition still ignores the index, but at least the search can be narrowed down using the first two columns.The telephone book analogy may help here too. If you search for someone with last name ‘Smith’ and whose first name contains "net" anywhere within the name, then at least you can limit your search to a subset of the book, people whose last name matches your condition on last name. But the first name still cannot be optimized; you just have to read through all the Smiths and evaluate them one by one.
(In addition to what Bill says…)
This may be the best:
Probably your first two tries were "fast" because they ignored the index. Use
EXPLAIN SELECT ...
to verify.