Write an SQL query to report the patient_id, patient_name all conditions of patients who have Type I Diabetes. Type I Diabetes always starts with DIAB1 prefix.
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| patient_id | int |
| patient_name | varchar |
| conditions | varchar |
+--------------+---------+
This table contains information of the patients in the hospital.
patient_id
is the primary key for this table. conditions
contains 0 or more code separated by spaces.
So this was my solution:
SELECT *
FROM Patients
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '%DIAB1%' ;
It worked correctly for all these conditions
patient_id | patient_name | conditions |
---|---|---|
1 | Daniel | YFEV COUGH |
2 | Alice | |
3 | Bob | DIAB100 MYOP |
4 | George | ACNE DIAB100 |
except for this condition
patient_id | patient_name | conditions |
---|---|---|
1 | Daniel | SADIAB100 |
And in the solution it was shown that there is a space after 1st % which would give you the correct answer:
correct query:
SELECT *
FROM Patients
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%' ;
So, can someone please explain why this query works for that particular condition (SADIAB100) and not the 1st query
2
Answers
The problem this is trying to address is when a condition contains the keyword (DIAB1) – while you only want to match on the beginning of the keyword.
The naive approach fails, because it matches on "SADIAB100":
So the workaround is to search for the keyword:
LIKE 'DIAB1%'
does' DIAB1%'
Hence:
A slightly neater expression is:
Bottom line: if you are using a relational database, you should not be storing multiple values in a single row.
Instead of a CSV-like format, you should have a separate table to store the conditions, with each value on a separate row, allowing you to leverage the powerful set-based features that your product offers.