# index: `review_count`, `store_id`, `status`, `name`, `address`
EXPLAIN ANALYZE SELECT
store0_.name,
store0_.status,
store0_.name,
store0_.address,
category1_.name
from
store store0_
inner join
category category1_
on store0_.category_id=category1_.category_id
left outer join
store_keyword storekeywo2_
on store0_.store_id=storekeywo2_.store_id
left outer join
keyword keyword3_
on storekeywo2_.keyword_id=keyword3_.keyword_id
where
store0_.review_count >= 0 AND ####
store0_.status='APPROVED'
AND store0_.name LIKE '가게%'
AND keyword3_.name = '분위기 좋은'
ORDER BY store0_.review_count
LIMIT 100
OFFSET 8000
If I add the condition `store0_.review_count >= 0 AND`, the query becomes about 10 times faster, going from 0.14ms to 0.014ms. The actual time was measured using the MySQL execution plan (in HeidiSQL).
The condition `store0_.review_count >= 0` is true for all rows, so I don't understand where the difference comes from.
Why is there a 10x performance improvement?
2 Answers
A better index would probably be:
and some more:
This is not about performance optimization, but about the readability and functionality of your query:

Your WHERE clause contains `AND keyword3_.name = '분위기 좋은'`. Outer-joined rows have no keyword. The name is null, so this condition dismisses all outer-joined rows. It seems you really want an inner join, so use inner joins instead of outer joins.

You only want to check that a store has the keyword, so do that lookup in the WHERE clause instead of joining the tables. This also leaves no ambiguity for the human reader as to whether your query might produce duplicates. (It shouldn't, as with proper keys installed it should not be possible to link the same keyword to a store more than once.)

Is `review_count` unique in your table? If not, the query result is not deterministic. It may be that rows 7999, 8000 and 8001 have the same review count, and every time you run the query you may get one or the other row in your result. If you ran the query before to get the first 8000 rows, you may even have duplicates in the newly selected 100 and miss others. Make your ORDER BY clause deterministic, e.g. `ORDER BY store0_.review_count, store0_.store_id`.

And this is about performance:
Here is how I might write this query:
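The query from the answer is not reproduced here, so what follows is only a sketch of how the points above could be combined, not the answerer's original. It assumes that the tuple comparison serves as keyset pagination in place of the large OFFSET; the aliases and the user variables `@last_review_count` and `@last_store_id` are hypothetical and stand for the sort key of the last row of the previous page.

```sql
-- Sketch only; aliases and user variables are hypothetical.
-- Sort key of the last row already shown (0, 0 is a placeholder start value).
SET @last_review_count = 0;
SET @last_store_id = 0;

SELECT s.name, s.status, s.address, c.name AS category_name
FROM store s
INNER JOIN category c ON c.category_id = s.category_id
WHERE s.status = 'APPROVED'
  AND s.name LIKE '가게%'
  -- Keyword lookup in the WHERE clause instead of outer joins.
  AND s.store_id IN (
        SELECT sk.store_id
        FROM store_keyword sk
        INNER JOIN keyword k ON k.keyword_id = sk.keyword_id
        WHERE k.name = '분위기 좋은'
      )
  -- Continue after the last row seen. Without tuple comparison, the same test is:
  --   s.review_count > @last_review_count
  --   OR (s.review_count = @last_review_count AND s.store_id > @last_store_id)
  AND (s.review_count, s.store_id) > (@last_review_count, @last_store_id)
-- store_id breaks ties, so the order is deterministic.
ORDER BY s.review_count, s.store_id
LIMIT 100;
```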
Being able to compare tuples with `>` came with MySQL 8. If you are using an old version that does not yet support this, replace the final condition by this:

As to indexes: You must of course be able to find a keyword quickly. The indexes for that are ones you are likely to have already.
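If those lookup indexes turn out to be missing, a minimal sketch could look like this. The index names are hypothetical; the table and column names are the ones used in the question's query.

```sql
-- Hypothetical index names; skip any index that already exists
-- (for example as a primary key or unique constraint).

-- Find a keyword by its name.
CREATE INDEX idx_keyword_name ON keyword (name);

-- Get from a keyword to the stores linked to it.
CREATE INDEX idx_store_keyword_keyword_store ON store_keyword (keyword_id, store_id);
```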
Then you want to index your store table, but there are four columns that are limiting the rows to access:
Now you want an index where you get to the core quickly. If 90% of the rows are ‘APPROVED’, then the status will not be very helpful. If only 0.9% of the rows are ‘APPROVED’, then it will be. If most review counts are negative, then the review count condition is very selective; if most are zero or positive, it isn’t. Do you expect many stores with the keyword or few? Build an index with the most selective column first, the second most selective second, and so on. E.g.:
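As an illustration of that ordering rule, the sketch below assumes that `status` is the most selective condition, followed by the name prefix. Both the assumption and the index name are hypothetical; your data may call for a different order.

```sql
-- Assumption: few rows are 'APPROVED', so status comes first,
-- then the name prefix, then the sort columns.
CREATE INDEX idx_store_status_name_review
    ON store (status, name, review_count, store_id);
```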
If you are unsure, create many indexes with different combinations:
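A sketch of what such a set of candidates might look like. The index names and the particular column orders are hypothetical, chosen only to show a few different leading columns.

```sql
-- Candidate indexes with different leading columns (names are hypothetical).
CREATE INDEX idx_store_cand_status ON store (status, review_count, store_id, name);
CREATE INDEX idx_store_cand_name   ON store (name, status, review_count, store_id);
CREATE INDEX idx_store_cand_review ON store (review_count, store_id, status, name);
```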
Then check which ones are being used and drop the others.
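One way to do that check, sketched under some assumptions: the `sys` schema is available (MySQL 5.7 or later with the Performance Schema enabled), and `idx_store_cand_name` is a hypothetical index name standing in for whichever candidate turns out to be unused.

```sql
-- The "key" column of the EXPLAIN output shows which index the optimizer picks.
EXPLAIN
SELECT store_id
FROM store
WHERE status = 'APPROVED'
  AND name LIKE '가게%'
ORDER BY review_count, store_id
LIMIT 100;

-- Indexes on the store table with no recorded use since the server started.
SELECT object_name, index_name
FROM sys.schema_unused_indexes
WHERE object_schema = DATABASE()
  AND object_name = 'store';

-- Drop a candidate that is never used (hypothetical name).
DROP INDEX idx_store_cand_name ON store;
```

The usage statistics behind `sys.schema_unused_indexes` are reset when the server restarts, so run a representative workload before trusting the list.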