skip to Main Content

Write an SQL query to report the patient_id, patient_name all conditions of patients who have Type I Diabetes. Type I Diabetes always starts with DIAB1 prefix.

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| patient_id   | int     |
| patient_name | varchar |
| conditions   | varchar |
+--------------+---------+

This table contains information of the patients in the hospital.

patient_id is the primary key for this table. conditions contains 0 or more code separated by spaces.

So this was my solution:

SELECT *
FROM Patients
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '%DIAB1%' ;

It worked correctly for all these conditions

patient_id patient_name conditions
1 Daniel YFEV COUGH
2 Alice
3 Bob DIAB100 MYOP
4 George ACNE DIAB100

except for this condition

patient_id patient_name conditions
1 Daniel SADIAB100

And in the solution it was shown that there is a space after 1st % which would give you the correct answer:

correct query:

SELECT *
FROM Patients
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%' ;

So, can someone please explain why this query works for that particular condition (SADIAB100) and not the 1st query

2

Answers


    • creating a regex pattern to find the keyword ‘DIAB1’
    • filtering on cases where it matches the above
    with main as (
    select
    *, REGEXP_LIKE(conditions,'DIAB1') as is_relevant_patient
    from <table_name>
    )
    select * from main where is_relevant_patient
    
    Login or Signup to reply.
  1. WHERE conditions LIKE ‘DIAB1%’ OR conditions LIKE ‘% DIAB1%’

    The problem this is trying to address is when a condition contains the keyword (DIAB1) – while you only want to match on the beginning of the keyword.

    The naive approach fails, because it matches on "SADIAB100":

    WHERE conditions LIKE '%DIAB1%'
    

    So the workaround is to search for the keyword:

    • either at the beginning of the whole string (ie at the beginning of the first condition) ; that’s what LIKE 'DIAB1%' does
    • or after another condition, in which case it is preceded by a space, so ' DIAB1%'

    Hence:

    WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%'
    

    A slightly neater expression is:

    WHERE CONCAT(' ', conditions) LIKE '% DIAB1%'
    

    Bottom line: if you are using a relational database, you should not be storing multiple values in a single row.

    Instead of a CSV-like format, you should have a separate table to store the conditions, with each value on a separate row, allowing you to leverage the powerful set-based features that your product offers.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search