I have a join query which takes a lot of time to process.
SELECT
COUNT(c.id)
FROM `customers` AS `c`
LEFT JOIN `setting` AS `ssh` ON `c`.`shop_id` = `ssh`.`id`
LEFT JOIN `customer_extra` AS `cx` ON `c`.`id` = `cx`.`customer_id`
LEFT JOIN `customers_address` AS `ca` ON `ca`.`id` = `cx`.`customer_default_address_id`
LEFT JOIN `lytcustomer_tier` AS `ct` ON `cx`.`lyt_customer_tier_id` = `ct`.`id`
WHERE (c.shop_id = '12121') AND ((DATE(cx.last_email_open_date) > '2019-11-08'));
This is primarily because the table ‘customers’ has 2 million records.
I could go over into indexing etc. But, the larger point is, this 2.5 million could become a billion records 1 day.
I’m looking for solutions which can enhance performance.
I’ve given thought to
a) horizontal scalability. -: distribute the mysql table into different sections and query the count independently.
b) using composite indexes.
c) My favourite one -: Just create a seperate collection in mongodb or redis which only houses the count(output of this query) Since, the count is just 1 number. this will not require a huge size aka better query performance (Only question is, how many such queries are there, because that will increase size of the new collection)
2
Answers
Try this and see if it improve performance:
As I mention in the comment, since the condition
AND ((DATE(cx.last_email_open_date) > '2019-11-08'));
, already madecustomers
table toINNER JOIN
withcustomer_extra
table, you might just change it toINNER JOIN customer_extra AS cx ON c.id = cx.customer_id
and follow it with otherLEFT JOIN
.The
INNER JOIN
will at least get the initial result to only return any customer who have last_email_open_date value based on what has been specified.COUNT(*)
, notCOUNT(c.id)
Remove these; they slow down the query without adding anything that I can see:
DATE(...)
makes that test not “sargable”. This works forDATE
orDATETIME
; and this is much faster:Consider whether that should be
>=
instead of>
.shop_id
. (Please provideSHOW CREATE TABLE
.)LEFT JOIN
whenJOIN
would work equally well.If
customer_extra
is columns that should have been incustomer
, now is the time to move them in. That would let you use this composite index for even more performance:With those changes, a billion rows in MySQL will probably not be a problem. If it is, there are still more fixes I can suggest.