skip to Main Content

I have a simple subquery in MySQL which produces a set of codes, something like ‘LPR’,’LFR’,’LFL’ (more codes can be returned).
Using this in my main query slows it right down, like it can take a minute or 2 to run.

I’m unable to show the full code, and there are over 2,000,000 rows in the orders table but, it’s basically:

SubQuery:

(SELECT DISTINCT prefix 
 FROM `photo` 
 WHERE thedate BETWEEN '2024/02/10' AND '2024/02/20' 
   AND user = 'ABCD1')

MainQuery:

SELECT
field1,
field2
FROM `orders`
WHERE ((orders.buyers_date) BETWEEN '2024/02/01' AND '2024/02/29') AND orders.user = 'ABCD1' 
AND orders.prefix IN(
    (SELECT DISTINCT prefix FROM `photo` WHERE thedate BETWEEN '2024/02/10' AND '2024/02/20' AND user = 'ABCD1')
) 

If within the MainQuery I replace the IN with the codes directly, as in

SELECT
field1,
field2
WHERE ((buyers_date) BETWEEN '2024/02/01' AND '2024/02/29') AND orders.user = 'ABCD1' 
AND orders.prefix IN('LPR','LFR','LFL') 

The query takes around 6 seconds. I thought the subquery would have run first, then the main query. This doesn’t seem to be the case.

I did try :

SET @prefs = (SELECT DISTINCT prefix FROM `photo` WHERE thedate BETWEEN '2024/02/10' AND '2024/02/20' AND user = 'ABCD1')

SELECT @prefs;

The result is "MySQL returned an empty result set (i.e. zero rows)", it’s Lying!

Any ideas?

2

Answers


  1. Use WHERE EXISTS:

    SELECT field1, field2
    FROM orders
    WHERE orders.buyers_date BETWEEN '2024/02/01' AND '2024/02/29'
      AND orders.user = 'ABCD1' 
      AND EXISTS ( SELECT NULL
                   FROM photo
                   WHERE photo.thedate BETWEEN '2024/02/10' AND '2024/02/20' 
                     AND photo.user = 'ABCD1'
                     AND orders.prefix = photo.prefix
                   );
    

    PS. Your date literals have wrong format (from MySQL looking point) – do you use VARCHAR datatype instead of DATE?

    Login or Signup to reply.
  2. Add these indexes — The order of the columns in these composite orders is important:

    photo:   INDEX(user, theDate, prefix)
    orders:  INDEX(user, prefix, buyers_date)
    orders:  INDEX(prefix, user, buyers_date)
    

    IN ( SELECT ... ) is rarely optimal.

    This may be the best way to structure your query:

    SELECT  o.field1, o.field2
        FROM  ( SELECT  DISTINCT prefix
                    FROM  `photo`
                    WHERE  thedate BETWEEN '2024/02/10' AND '2024/02/20'
                      AND  user = 'ABCD1') 
              ) AS p
        JOIN `orders` AS o  ON o.prefix = p.prefix
        WHERE  o.buyers_date BETWEEN '2024/02/01' AND '2024/02/29'
          AND  orders.user = 'ABCD1'
    

    (I don’t know which of my recommendations for index for orders is better. By having both, the Optimizer will pick the one it thinks is better. The choice may depend on the distribution of the data, so it may change. See EXPLAIN SELECT...)

    Also, I don’t know whether Akina’s suggestion will be better. But his formulation needs:

    photo:   INDEX(user, prefix, theDate)
    orders:  INDEX(user, buyers_date, prefix)
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search