I have a query that returns combined locations e.x Arizona, California
from the locations table only where each individual location Arizona | Yes
and California | Yes
has been selected by the user in the users table. It works, but it’s really slow. Is there a faster way to write this?
SELECT *
FROM locations
WHERE combined_locations NOT IN (
SELECT DISTINCT combined_locations
FROM locations
JOIN users ON locations.combined_locations REGEXP users.user_locations
WHERE users.user_selection = 'No'
AND users.user_id = 1
)
AND combined_locations IN (
SELECT DISTINCT combined_locations
FROM locations
JOIN users ON locations.combined_locations REGEXP users.user_locations
WHERE users.user_selection = 'Yes'
AND users.user_id = 1
);
I’ve read that REGEXP is slow, so I’ve tried LIKE CONCAT('%', users.user_locations, '%')
but it was only marginally faster.
I have also tried various indexing combinations but they didn’t seem to get much improvement.
Fiddle: https://dbfiddle.uk/-Hb1rqTm
I’m still learning how to write good questions, so hopefully this is better than my first post.
2
Answers
You might get faster performance using
FIND_IN_SET
and a couple of joins:Note that the logic implemented here is "at least one location with user_selection = ‘Yes’ and no location with user_selection = ‘No’" – I think this is the same as the logic implemented by your query.
If the logic instead should be "all locations with user_selection = ‘Yes’ and no location with user_selection = ‘No’", that can be accomplished with a
GROUP BY
clause and aHAVING
clause (with a count):