skip to Main Content

I have a query that returns combined locations e.x Arizona, California from the locations table only where each individual location Arizona | Yes and California | Yes has been selected by the user in the users table. It works, but it’s really slow. Is there a faster way to write this?

SELECT *
FROM locations
WHERE combined_locations NOT IN (
    SELECT DISTINCT combined_locations
    FROM locations
    JOIN users ON locations.combined_locations REGEXP users.user_locations
    WHERE users.user_selection = 'No'
    AND users.user_id = 1
)
AND combined_locations IN (
    SELECT DISTINCT combined_locations
    FROM locations
    JOIN users ON locations.combined_locations REGEXP users.user_locations
    WHERE users.user_selection = 'Yes'
    AND users.user_id = 1
);

I’ve read that REGEXP is slow, so I’ve tried LIKE CONCAT('%', users.user_locations, '%') but it was only marginally faster.

I have also tried various indexing combinations but they didn’t seem to get much improvement.

Fiddle: https://dbfiddle.uk/-Hb1rqTm

I’m still learning how to write good questions, so hopefully this is better than my first post.

2

Answers


  1. You might get faster performance using FIND_IN_SET and a couple of joins:

    SELECT DISTINCT locations.*
    FROM locations
    INNER JOIN users users_include ON
      users_include.user_id = 1      
      AND FIND_IN_SET(users_include.user_locations, locations.combined_locations) > 0
      AND users_include.user_selection = 'Yes'
    LEFT JOIN users users_exclude ON
      users_exclude.user_id = 1
      AND FIND_IN_SET(users_exclude.user_locations, locations.combined_locations) > 0
      AND users_exclude.user_selection = 'No'
    WHERE users_exclude.user_id IS NULL
    

    Note that the logic implemented here is "at least one location with user_selection = ‘Yes’ and no location with user_selection = ‘No’" – I think this is the same as the logic implemented by your query.

    If the logic instead should be "all locations with user_selection = ‘Yes’ and no location with user_selection = ‘No’", that can be accomplished with a GROUP BY clause and a HAVING clause (with a count):

    SELECT locations.*
    FROM locations
    INNER JOIN users users_include ON
      users_include.user_id = 1
      AND FIND_IN_SET(users_include.user_locations, locations.combined_locations) > 0
      AND users_include.user_selection = 'Yes'
    LEFT JOIN users users_exclude ON
      users_exclude.user_id = 1
      AND FIND_IN_SET(users_exclude.user_locations, locations.combined_locations) > 0
      AND users_exclude.user_selection = 'No'
    WHERE users_exclude.user_id IS NULL
    GROUP BY locations.id, locations.column1, locations.column2, locations.combined_locations
    HAVING (1 + CHAR_LENGTH(combined_locations) - CHAR_LENGTH(REPLACE(combined_locations, ',', ''))) = COUNT(users_include.user_id)
    
    Login or Signup to reply.
  2. SELECT locations.id, locations.combined_locations
    FROM users
    JOIN locations ON FIND_IN_SET(users.user_locations, locations.combined_locations)
    WHERE users.user_id = 1
    GROUP BY 1, 2
    HAVING SUM(users.user_selection = 'No') = 0
       AND SUM(users.user_selection = 'Yes') = 1 + LENGTH(locations.combined_locations) - LENGTH(REPLACE(locations.combined_locations, ',', ''))
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search