skip to Main Content
# index: `review_count`,  `store_id`, `status`, `name`, `address`
EXPLAIN ANALYZE  SELECT 
            store0_.name,
        store0_.status,
        store0_.name,
        store0_.address,
        category1_.name

    from
        store store0_ 
    inner join
        category category1_ 
            on store0_.category_id=category1_.category_id 
    left outer join
        store_keyword storekeywo2_ 
            on store0_.store_id=storekeywo2_.store_id 
    left outer join
        keyword keyword3_ 
            on storekeywo2_.keyword_id=keyword3_.keyword_id 
    where
     store0_.review_count  >= 0 AND ####
        store0_.status='APPROVED'
        AND  store0_.name LIKE '가게%'
        AND keyword3_.name = '분위기 좋은'
      
ORDER BY store0_.review_count
   LIMIT 100
   OFFSET 8000

If you put the condition " store0_.review_count >= 0 AND", there is a performance improvement of about 10 times from 0.14ms to 0.014ms At this time, the actual time was measured using the execution plan of mysql (heidisql).

(I don’t know why there is such a difference in performance.)
Because the condition " store0_.review_count >= 0 AND" is true by all rows.
Why is there a 10x performance improvement?

2

Answers


  1. Probably a better index would be :

    CREATE INDEX X001 ON store (status, category_id, name, review_count, address)
    

    and some more :

    CREATE INDEX X002 ON keyword (name, keyword_id)
    CREATE INDEX X003 ON store_keyword (store_id, keywrd_id)
    
    Login or Signup to reply.
  2. This is not about performance optimization, but about readability / functionality of your query:

    • You outer join the stores’ keywords, so as to show stores without keywords? But then you have AND keyword3_.name = '분위기 좋은'. Outer joined rows have no keyword. It is null, so you dismiss all outer joined rows. It seems you really want an inner join, so use inner joins instead of outer joins.
    • Your query result shows stores and their categories. So, the keyword thing is just a condition. Make this clear by putting the condition in the WHERE clause instead of joining the tables. This also leaves no amiguity to the human reader whether your query might produce duplicates. (It shouldn’t, as with proper keys installed it should not be possible to link the same keyword to a store more than once.)
    • Is the review_count unique in your table? If not, the query result is not deterministic. It may be that row 7999, 8000 and 8001 have the same review count. And every time you run the query you may get one or the other row in your result. If you even ran the query before to get the first 8000 rows, you may have duplicates in the newly selected 1000 and miss others. Make your ORDER BY clause deterministic, e.g. ORDER BY store0_.review_count, store0_.store_id.

    And this is about performance:

    • Working with offsets is slow. The DBMS will have to sort all your potential result rows in order to find position 8000 and get the next 1000. So if your idea is to get the first 1000, then the next 1000, then the next 1000, etc., the DBMS will have to sort the table again and again and again. Instead remember how far you have got and only select rows from there on. Thus the intermediate result that must get sorted will get smaller and smaller with every new call.

    Here is how I might write this query:

    select 
      s.name,
      s.status,
      s.address,
      c.name
    from store s
    inner join category c on c.category_id = s.category_id 
    where s.review_count >= 0
    and s.status = 'APPROVED'
    and s.name like '가게%'
    and s.store_id in
    (
      select sk.store_id
      from store_keyword sk 
      where sk.keyword_id = 
      (
        select k.keyword_id 
        from keyword k
        where k.name = '분위기 좋은'
      )
    )
    and (s.review_count, s.store_id) > (@last_review_count, @last_store_id)
    order by s.review_count, s.store_id
    limit 1000;
    

    Being able to compare tuples with > came with MySQL 8. If your are using an old version that does not yet support this, replace the final condition by this:

    and 
    (
      s.review_count > @last_review_count
      or 
      (s.review_count = @last_review_count and s.store_id > @last_store_id)
    )
    

    As to indexes: You must of course be able to find a keyword quickly

    create unique index idx1 on keyword (name, keyword_id);
    create unique index idx2 on keyword (keyword_id, store_id);
    

    These are indexes that you are likely to have already.

    Then you want to index your store table, but there are four columns that are limiting the rows to access:

    • review_count >= 0
    • status = ‘APPROVED’
    • name LIKE ‘가게%’
    • store_id in the set of those with keyword ‘분위기 좋은’

    Now you want an index where you get to the core quicky. If 90% of the rows are ‘APPROVED’, then the status will not be very helpful. If only 0.9% of the rows are ‘APPROVED’, then it will be. If most review counts are negative, then this is very selective, if most are zero or positive, it ain’t. Do you expect many stores with the keyword or few? Build an index with the most selective column first, the second most selective second and so on. E.g.:

    create index idx3 on store (status, store_id, name, review_count);
    

    If you are insecure, create many indexes with different combinations:

    create index idx4 on store (review_count, status, name, store_id);
    ...
    

    Then check which ones are being used and drop the others.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search