I always come across these kinds of SQL problems and find it confusing what to do. For example, suppose there is a table of : names, year, food.
`CREATE TABLE t (
id SERIAL PRIMARY KEY,
name VARCHAR(50),
food VARCHAR(50)
);
INSERT INTO t (name, food) VALUES
('john', 'pizza'),
('john', 'cake'),
('andrew', 'pizza'),
('andrew', 'pizza'),
('andrew', 'pizza'),
('matt', 'pizza'),
('matt', 'pizza'),
('matt', 'burger'),
('david', 'cake'),
('david', 'pizza'),
('david', 'pizza'),
('elen', 'cake'),
('elen', 'pizza'),
('elen', 'donuts'),
('claire', 'cake'),
('claire', 'donuts'),
('claire', 'tacos'),
('john', 'pizza'),
('john', 'cake'),
('matt', 'apples'),
('matt', 'tacos');
`
If a name has some rows that meet a certain condition, I want to select all rows and all columns for that name (even the rows that for that name that don’t directly match). For example:
- Problem 1 Select all rows/columns for names that ONLY like pizza.
Types of rows that would be selected:
john : pizza ;
andrew: pizza, pizza, pizza
Types of rows that would not be selected:
matt: pizza, pizza, burger
- Problem 2: Select all rows/columns for names that ONLY like pizza and cake
Types of rows that would be selected:
john: cake, pizza ;
david: cake, pizza, pizza
Types of rows that would not be selected:
elen : cake, pizza, donuts
- Problem 3: Select all rows/columns for names that DONT like pizza at all
Types of rows that would be selected:
claire: cake, donuts, tacos
Types of rows that would be not be selected:
john: pizza, cake
- Problem 4: Select all rows/columns for names that DONT like pizza and cake at all
Types of rows that would be selected:
matt : apples, tacos
Types of rows that would not be selected:
john : cake, apples, donuts;
elen: pizza, chocolate;
tyler: pizza, cake;
I never how to write the query for these problems.
For example:
Problem 1:
SELECT *
FROM t
WHERE name IN (
SELECT name
FROM t
WHERE food IN ('pizza')
GROUP BY name
HAVING COUNT(DISTINCT food) = 1 AND MIN(CASE WHEN food NOT IN ('pizza') THEN 1 ELSE 0 END) = 0
);
Problem 2:
SELECT *
FROM t
WHERE name IN (
SELECT name
FROM t
WHERE food IN ('pizza', 'cake')
GROUP BY name
HAVING COUNT(DISTINCT food) <= 2 AND MIN(CASE WHEN food NOT IN ('pizza', 'cake') THEN 1 ELSE 0 END) = 0
);
Problem 3:
SELECT *
FROM t
WHERE name NOT IN (
SELECT name
FROM t
WHERE food IN ('pizza')
);
Problem 4:
SELECT *
FROM t
WHERE name NOT IN (
SELECT name
FROM t
WHERE food IN ('pizza', 'cake')
);
I get so confused when writing these SQL statements, thinking about which conditions are needed to correctly identify the logical conditions. Is there a way to simplify the logic for these types of queries?
2
Answers
I think queries 3 and 4 are perfect like that, and what you need to do in 1 and 2 is try to think more in terms of sets, like in 3 and 4, instead of trying to find conditions based on counts of rows.
Here’s how I would do them
Problem 1:
Basically select all the names with food pizza, and that are not included in the set of names with any other food. If I understood you correctly we should only get andrew here (because jonh also liked cake)
Problem 2:
Select all names that are included in the set of names that liked pizza and/or cake, and that are not included in the set of names that liked something different than cake/pizza
Be careful in problem 2, the query above is for names that liked pizza AND/OR cake, and nothing else. If you wanted to get names that liked pizza AND cake and nothing else (eliminating andrew, who liked pizza but not cake), then you would have to add the condition that both pizza and cake should exist in the foods:
If you are not used to the exists/not exists operators you can easily change the queries to use in/not in instead:
Problem 1:
Problem 2:
If you create a cte that lists food per name – you can get answers to all 4 questions simply by changing where conditions.
The code for 1st question would be like below…
… and for all other questions just change the WHERE clause …