I’m currently trying to find a way to return all rows in which there is at least one other row with the same values in any number of columns and unique values in any number of other columns.
I want a generic approach to this to be able to apply it to any scenario in which I want to get all rows that have another row with a matching id or multiple id columns but also has all distinct values for another column or other columns.
My current challenge is trying to get rows from a license db where each row is a license. I want to see if multiple users have two of the same kind of license active. Therefore, I want to get all rows that represent active licenses that have at least one other row with a matching email and license type, but different license ids. This is so I can tell if a user is being charged twice for the same type of license. Here is an example table called Licenses (assume there is a unique primary key):
id | type | uid | |
---|---|---|---|
76 | B | bob | 12 |
96 | C | bob | 12 |
1 | A | bob | 11 |
1 | A | bob | 11 |
3 | A | bob | 11 |
3 | A | bob | 12 |
90 | A | bob | 12 |
99 | A | joe | 14 |
5 | B | joe | 15 |
6 | B | joe | 15 |
7 | B | joe | 16 |
12 | A | pat | 23 |
13 | A | pat | 23 |
57 | A | ira | 47 |
57 | A | ira | 47 |
60 | A | ian | 99 |
This is what I want to get:
id | type | uid | |
---|---|---|---|
1 | A | bob | 11 |
3 | A | bob | 11 |
90 | A | bob | 12 |
5 | B | joe | 15 |
6 | B | joe | 15 |
7 | B | joe | 16 |
12 | A | pat | 23 |
13 | A | pat | 23 |
But I am getting this (notice the the extra rows for bob):
id | type | uid | |
---|---|---|---|
1 | A | bob | 11 |
1 | A | bob | 11 |
3 | A | bob | 11 |
3 | A | bob | 12 |
90 | A | bob | 12 |
5 | B | joe | 15 |
6 | B | joe | 15 |
7 | B | joe | 16 |
12 | A | pat | 23 |
13 | A | pat | 23 |
So, ira was able to be filtered out properly since all records for ira and type A have the id of 57. BUT, bob has 2 extra rows included since the id differs from at least one other id with the same email and type. How can I write a query that will not return these duplicate IDs? This is the query I tried:
SELECT * FROM Licenses t1
WHERE EXISTS (
SELECT 1 FROM Licenses t2
WHERE t1.email = t2.email
AND t1.type = t2.type
AND t1.id <> t2.id
) ORDER BY email;
How would I write another query that further filters on column x to make sure x is unique for all the entries as well?
Is there a generic way to approach a problem like this?
Thanks!
2
Answers
This should work:
if you want to change the columns that define duplicates then just add/remove them from the CTE SELECT/GROUP BY and amend the join condition
Try the following:
demo