I have the following query:
SELECT *
FROM customers_cards
WHERE id NOT IN ( SELECT card_id FROM customers_card_psp)
It takes about 2 seconds to execute, and we need to call this query on a highly visited page. Any idea how I can optimize it?
Here is the result of EXPLAIN: [image not preserved]
Note that card_id is not unique (there are 5 rows in customers_card_psp per row in customers_cards), but adding DISTINCT before card_id doesn’t make anything better.
Note also that using LEFT JOIN gives the same performance:
SELECT cc.*
FROM customers_cards cc
LEFT JOIN customers_card_psp ccp ON ccp.card_id = cc.id
WHERE ccp.id IS null
2 Answers
As presented, this query needs to read EVERY SINGLE ROW in customers_cards. Hence either:
1. Your "highly visited page" is displaying hundreds of thousands of rows,
2. there are other filtering predicates you’ve not told us about, or
3. most of the records in customers_cards are excluded from the result set on the grounds of having corresponding records in customers_card_psp.
For scenario 1 this is bad UX design, which you can’t fix in your database. But if it really is the right way to solve the problem, consider caching outside the DBMS.
For scenario 2 we can’t help you tune a query you have not shown us (nor have you included the schema).
For scenario 3 you need to redesign your table structure.
These are 3 ways to say the same thing: NOT IN ( SELECT ... ) vs NOT EXISTS ( SELECT ... ) vs LEFT JOIN ... WHERE ... IS NULL.
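Written against the tables from the question, the three forms might look like the sketch below. The NOT EXISTS version does not appear in the question, so its exact shape is an assumption; the aliases cc and ccp follow the question's LEFT JOIN. Note that NOT IN matches the other two only when card_id is never NULL: a single NULL card_id makes NOT IN return no rows at all.

```sql
-- 1. NOT IN (the form from the question)
SELECT cc.*
FROM customers_cards cc
WHERE cc.id NOT IN ( SELECT card_id FROM customers_card_psp );

-- 2. NOT EXISTS (assumed shape; only needs to find the first matching row)
SELECT cc.*
FROM customers_cards cc
WHERE NOT EXISTS ( SELECT 1
                   FROM customers_card_psp ccp
                   WHERE ccp.card_id = cc.id );

-- 3. LEFT JOIN ... IS NULL (the form from the question)
SELECT cc.*
FROM customers_cards cc
LEFT JOIN customers_card_psp ccp ON ccp.card_id = cc.id
WHERE ccp.id IS NULL;
```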
The NOT IN ( SELECT ... ) is notoriously inefficient and should be avoided. The EXISTS and LEFT JOIN are probably compiled identically, hence equally good. The EXISTS is called a "semi-join", meaning that it only needs to find 0 or 1 row, not all 5, in order to satisfy the test.
Related: The comparisons are probably the same without the NOT (and changing IS NULL to IS NOT NULL). That is, do not use IN; use either of the others.
(Different versions of MySQL/MariaDB have optimized these things differently.)
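Because the optimizer's behavior varies by version, it may be worth comparing the plans on your own server rather than assuming. A minimal sketch, using the tables from the question: whether customers_card_psp has an index on card_id is unknown because the schema was not posted, and the index name idx_card_id is hypothetical.

```sql
-- See which indexes already exist on the child table.
SHOW INDEX FROM customers_card_psp;

-- Inspect the plan for the NOT EXISTS form; repeat for the other forms
-- and compare the "type", "key", and "rows" columns.
EXPLAIN
SELECT cc.*
FROM customers_cards cc
WHERE NOT EXISTS ( SELECT 1
                   FROM customers_card_psp ccp
                   WHERE ccp.card_id = cc.id );

-- Assumption: only relevant if SHOW INDEX reports no index on card_id.
-- ALTER TABLE customers_card_psp ADD INDEX idx_card_id (card_id);
```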