skip to Main Content

Edit:
EXPLAIN result for the first two queries when using src_cus_lang:
enter image description here
EXPLAIN result for the last two queries when using cust_lang_src:
enter image description here

Original post:
I’m using MySQL and have a table as follows:

CREATE TABLE tm(
    id INT AUTO_INCREMENT PRIMARY KEY,
    source TEXT(7000) NOT NULL,
    target TEXT(6000) NOT NULL,
    language CHAR(3),
    customer VARCHAR(10))

When using composite index (source(755), customer, language), the fetch times are as follows:

SELECT source FROM tm WHERE customer = 'Customer A' AND language = 'ENU' AND source LIKE '%.net%';  -- 0.015s
SELECT source FROM tm WHERE source LIKE '%.net%' AND customer = 'Customer A' AND language = 'ENU';  -- 0.015s

However, when using composite index (customer, language, source(755)), the fetch times increased drastically:

SELECT source FROM tm WHERE customer = 'Customer A' AND language = 'ENU' AND source LIKE '%.net%';  -- 1.7s
SELECT source FROM tm WHERE source LIKE '%.net%' AND customer = 'Customer A' AND language = 'ENU';  -- 1.7s

This is contrary to what I found somewhere on the Internet, which claims that the leftmost prefix rule is applied until <, >, BETWEEN or LIKE is met. According to this claim, the (customer, language, source(755)) index would be faster than (source(755), customer, language) index regarding fetch time, since source is used with LIKE. I can’t post the source, because it’s not in English. Is this claim wrong? If not, why did I see contrary results?

2

Answers


  1. Your condition source LIKE '%.net%' cannot use an index. The wildcard at the start of your pattern makes the index useless, because the index entries are sorted from the start of the string.

    By analogy, think of a telephone book. It’s sorted by names. Can you find entries with "net" in the middle of the name efficiently? No, because the sort order of the book doesn’t help at all.

    Since that condition cannot use an index, the index on (source(755), customer, language) cannot be used. It can’t use the first column, so it can’t use any subsequent columns.

    The index on (customer, language, source(755)) can be used, but only for the first two columns. The LIKE condition still ignores the index, but at least the search can be narrowed down using the first two columns.

    The telephone book analogy may help here too. If you search for someone with last name ‘Smith’ and whose first name contains "net" anywhere within the name, then at least you can limit your search to a subset of the book, people whose last name matches your condition on last name. But the first name still cannot be optimized; you just have to read through all the Smiths and evaluate them one by one.

    Login or Signup to reply.
  2. (In addition to what Bill says…)

    This may be the best:

    PRIMARY KEY(customer, language, id),  -- to optimize the query (and maybe others)
    INDEX(id)   -- to keep AUTO_INCREMENT happy
    

    Probably your first two tries were "fast" because they ignored the index. Use EXPLAIN SELECT ... to verify.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search