I need to find supersets within an association table.
You can consider bag
to be a collection of item
records. The relationship between the two is tracked in bag_item
.
I need to know which bag
s are supersets of other bag
s: containing ALL the item
s and possibly one or more additional items.
bag
id | name |
---|---|
1 | Bag A |
2 | Bag B |
3 | Bag C |
4 | Bag D |
item
id | name |
---|---|
1 | Pencil |
2 | Pen |
3 | Eraser |
4 | Marker |
bag_item
(association table)
id | bag_id | item_id |
---|---|---|
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 1 |
4 | 2 | 2 |
5 | 2 | 3 |
6 | 2 | 4 |
7 | 3 | 1 |
8 | 3 | 4 |
9 | 4 | 1 |
10 | 4 | 2 |
11 | 4 | 3 |
Database: MySQL 8
I have "base" bag
identifiers and I want to get "superset" bag
identifiers that contain ALL the items in my "base" bag (and more items if present).
If I query for ALL bags, the response should look something like this:
base_bag_id | superset_bag_id |
---|---|
1 | 2 |
1 | 4 |
3 | 2 |
4 | 2 |
But, I also want to be able to filter on the basis of the base_bag_id
, e.g. 1 & 4.
I’ve tried something to the effect of the following:
select distinct base_bag_item.bag_id, superset_bag_item.bag_id
from bag_item as base_bag_item
join bag_item as superset_bag_item on superset_bag_item.item_id = base_bag_item.item_id
where superset_bag_item.bag_id != base_bag_item.bag_id
and base_bag_item.bag_id in (1, 4)
order by base_bag_item.bag_id, superset_bag_item.bag_id
However, this query will incorrectly produce bags that contain at least one of the quotes from the "base", e.g. base_bag_id
1 will have superset_bag_id
3 because both bags contain item_id
1.
How do I get my desired output?
3
Answers
Try this solution please:
This is classic Relational Division With Remainder, with the multiple divisors tweak (you want to get all bags for all other possible bags, not just for one bag).
There are a number of solutions. A simple one, if rather inefficient, is a double
NOT EXISTS
.A more efficient solution is to join everything up using a left-join, then group and check the count for missing values
db<>fiddle
Assuming that this means "find a bug which contains all items for each specified bag":
fiddle
If the bags in the list must be not included into the superset bags list (bag #4 in the example) then add
WHERE NOT FIND_IN_SET(bag_id, @base_bags)
to CTEbags_list
.