I have two relevant tables in MySQL that look something like this:
marker
table
id | SEASONCD | ITEMCD | PRICETYPECD |
---|---|---|---|
1 | foo | bar | baz |
2 | foo | bar | baz |
3 | foo | bar | baz |
4 | qux | bar | baz |
5 | qux | bar | baz |
6 | spam | eggs | ham |
seat_marker
table
id | MARKER_ID |
---|---|
1 | 1 |
2 | 1 |
3 | 1 |
4 | 5 |
5 | 6 |
I would like to select all of the rows in the marker
table that are duplicates, excluding the one that is most frequently referenced / joined by the seat_marker
table. In other words, I would like to list all duplicates excluding the "original", where "original" is defined as the row with the most occurrences in seat_marker.MARKER_ID
.
The result would look like this:
id | SEASONCD | ITEMCD | PRICETYPECD |
---|---|---|---|
2 | foo | bar | baz |
3 | foo | bar | baz |
5 | qux | bar | baz |
We can see this is the original data from the marker
table, but with three rows removed;
- id 1 was removed because it was the foo-bar-baz duplicate that occurred more frequently in
seat_marker
than the other foo-bar-baz rows. - id 4 was removed because it was a duplicate. Since the qux-bar-baz rows show up 0 times in
seat_marker
, I don’t have a preference if we remove id 4 or id 5. I chose id 4 at random. - id 6 was removed because it is not a duplicate
The closest I’ve been able to get is this, which lists all values that are a duplicate and the number of assignments it has. But does not exclude the "originals".
-- Select all duplicates, and list the number of times it's referenced in the seat_marker table
SELECT
n1.id, n1.SEASONCD, n1.ITEMCD, n1.PRICETYPECD, (
SELECT COUNT(1)
FROM seat_marker
WHERE MARKER_ID = n1.id
) active_assignments
FROM marker n1, marker n2
WHERE
n1.id <> n2.id AND
n1.SEASONCD = n2.SEASONCD AND
n1.ITEMCD = n2.ITEMCD AND
n1.PRICETYPECD = n2.PRICETYPECD
GROUP BY id, SEASONCD, ITEMCD, PRICETYPECD, active_assignments
ORDER BY SEASONCD, ITEMCD, PRICETYPECD, active_assignments DESC, id;
2
Answers
Based on the modifications you made in the question, you can use the following code:
The expected result included in your question appears to be incorrect. It should be displaying
4
instead of5
, as5
has one corresponding row inseat_marker
and is therefor the "row with the most occurrences".One way to do this is with COUNT and
ROW_NUMBER()
:Output:
Here’s a db<>fiddle.