I am working with a MySQL database and I have a column (let’s call it status_column) which contains comma-separated values. The possible values in this column are ‘NA’, ‘NON_NA’, ‘PSG_REQUIRED’, or any combination of these, such as ‘NA,PSG_REQUIRED’ or ‘PSG_REQUIRED,NON_NA’, etc.
I need to write efficient queries to retrieve rows based on the following conditions:
- Rows where status_column contains ‘NA’ but not ‘NON_NA’.
- Rows where status_column contains ‘NON_NA’.
- Rows where status_column contains both ‘NA’ and ‘NON_NA’.
It’s important to note that I want to exclude any rows where status_column may be NULL or an empty string, even when I’m looking for both ‘NA’ and ‘NON_NA’.
I’ve tried using the FIND_IN_SET() function in MySQL, which requires me to use two separate conditions for the third case (matching both ‘NA’ and ‘NON_NA’). Here are my current queries:
For condition 1:
SELECT * FROM my_table
WHERE FIND_IN_SET('NA', status_column) > 0;
For condition 2:
SELECT * FROM my_table
WHERE FIND_IN_SET('NON_NA', status_column) > 0;
For condition 3:
SELECT * FROM my_table
WHERE FIND_IN_SET('NA', status_column) > 0
AND FIND_IN_SET('NON_NA', status_column) > 0;
These queries work, but I am wondering if there is a more optimized way to achieve the same results, especially for the third condition where I am currently using two FIND_IN_SET() calls.
I also considered using the REGEXP function as follows:
SELECT * FROM my_table
WHERE status_column REGEXP '(^|,)NA(,|$)|(,|^)NON_NA(,|$)';
However, I’m unsure if this is the most efficient approach or if there are any better alternatives that I might be missing.
Could someone suggest a more optimal query or confirm if the REGEXP approach is the best way to handle this in MySQL?
2
Answers
Rows where status_column contains ‘NA’ but not ‘NON_NA’:
SELECT *
FROM your_table
WHERE FIND_IN_SET(‘NA’, status_column) > 0
AND FIND_IN_SET(‘NON_NA’, status_column) = 0
AND status_column IS NOT NULL
AND status_column != ”;
Rows where status_column contains ‘NON_NA’:
SELECT *
FROM your_table
WHERE FIND_IN_SET(‘NON_NA’, status_column) > 0
AND status_column IS NOT NULL
AND status_column != ”;
Rows where status_column contains both ‘NA’ and ‘NON_NA’:
SELECT *
FROM your_table
WHERE FIND_IN_SET(‘NA’, status_column) > 0
AND FIND_IN_SET(‘NON_NA’, status_column) > 0
AND status_column IS NOT NULL
AND status_column != ”;
You can use index on expression.
This expression can produce bit mask for values ‘NA’,’NON_NA’,’PSG_REQUIRED’
Search (WHERE FIND_IN_SET(‘NA’, status_column) > 0) condition my convert to
IN(....)
.If you carefully write out the expressions when creating the index and search conditions, the query will use the
Index range scan
.See example
Test data
Query
Execute plan