I have these tables:
Project
id | name | version |
---|---|---|
1 | Swam | 0.0.1 |
2 | Dinali | 0.0.1 |
3 | Dinali | 0.0.2 |
4 | BigR | 0.0.3 |
5 | Kale | 0.0.1 |
6 | Kale | 0.0.2 |
Person
id | name |
---|---|
1 | Jake |
2 | Skye |
3 | Kieth |
4 | Jim |
5 | Eliz |
6 | Haun |
Person_Project
id | person_id | project_id |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 2 | 2 |
4 | 3 | 1 |
5 | 3 | 2 |
6 | 4 | 1 |
7 | 4 | 4 |
8 | 5 | 1 |
9 | 6 | 1 |
10 | 6 | 2 |
11 | 6 | 3 |
I want to find all persons who work on exactly the same set of projects. From the data above, the result should be persons 1 and 5, because both work only on project 1, and persons 2 and 3, because both work on projects 1 and 2.
Persons 4 and 6 should not be returned, as nobody else works on exactly the same set of projects as they do.
2 Answers
Query:
Output:
fiddle
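A minimal sketch of a query of this kind, assuming PostgreSQL, the tables from the question, and an integer `project_id` (this is an illustration, not necessarily the answerer's exact query). It folds each person's projects into a sorted string, then keeps only strings shared by more than one person:

```sql
-- Sketch only: assumes PostgreSQL and the person_project table from the question.
SELECT projects
     , string_agg(person_id::text, ',' ORDER BY person_id) AS persons
FROM  (
   -- One row per person with a sorted, comma-separated list of project ids
   SELECT person_id
        , string_agg(project_id::text, ',' ORDER BY project_id) AS projects
   FROM   person_project
   GROUP  BY person_id
   ) sub
GROUP  BY projects
HAVING count(*) > 1;   -- keep only project sets shared by 2+ persons
```

For the sample data this should group persons 1 and 5 under project list `1`, and persons 2 and 3 under `1,2`.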
This can be optimized.
2x subquery, 1x window function
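One way this variant could look, as a sketch assuming PostgreSQL and the `person_project` table from the question (the aliases `sub1`, `sub2`, and `ct` are placeholders chosen for illustration):

```sql
-- Sketch only: two nested subqueries plus one window function.
SELECT person_id, projects
FROM  (
   -- Count how many persons share each aggregated project list
   SELECT person_id, projects
        , count(*) OVER (PARTITION BY projects) AS ct
   FROM  (
      -- One row per person with a sorted list of project ids
      SELECT person_id
           , string_agg(project_id::text, ',' ORDER BY project_id) AS projects
      FROM   person_project
      GROUP  BY person_id
      ) sub1
   ) sub2
WHERE  ct > 1
ORDER  BY projects, person_id;
```

Unlike a `GROUP BY ... HAVING` formulation, this returns one row per matching person rather than one row per shared project set.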
1x CTE, 1x `EXISTS`
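A sketch of this variant under the same assumptions (PostgreSQL, the `person_project` table from the question; the CTE name `cte` and its aliases are illustrative):

```sql
-- Sketch only: aggregate once in a CTE, then look for another person
-- with an identical project list.
WITH cte AS (
   SELECT person_id
        , string_agg(project_id::text, ',' ORDER BY project_id) AS projects
   FROM   person_project
   GROUP  BY person_id
   )
SELECT c.person_id, c.projects
FROM   cte c
WHERE  EXISTS (
   SELECT 1
   FROM   cte c1
   WHERE  c1.projects  = c.projects     -- same set of projects
   AND    c1.person_id <> c.person_id   -- but a different person
   )
ORDER  BY c.projects, c.person_id;
```

`EXISTS` can stop at the first match per person, which may help when many persons share the same project set.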
In addition, `array_agg()` should be faster than `string_agg()` (also avoiding the cast).

For large numbers of persons and/or projects, it should pay to create a temporary table instead of the CTE, add an index on `(hash_array(projects), person_id)`, and compare the hash values.

For very large sets, use `hash_array_extended(projects, 0)` to practically rule out hash collisions.
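The temporary-table approach might be sketched as follows, assuming PostgreSQL and an integer `project_id`. The table name `tmp_person_projects` is hypothetical, and the extra array comparison is an added safeguard against hash collisions rather than something the answer prescribes:

```sql
-- Sketch only: materialize the aggregated arrays instead of using a CTE.
CREATE TEMP TABLE tmp_person_projects AS
SELECT person_id
     , array_agg(project_id ORDER BY project_id) AS projects
FROM   person_project
GROUP  BY person_id;

-- Expression index on the array hash, so matches are found by hash value
CREATE INDEX ON tmp_person_projects (hash_array(projects), person_id);
ANALYZE tmp_person_projects;

SELECT p.person_id, p.projects
FROM   tmp_person_projects p
WHERE  EXISTS (
   SELECT 1
   FROM   tmp_person_projects p1
   WHERE  hash_array(p1.projects) = hash_array(p.projects)  -- indexed comparison
   AND    p1.projects  = p.projects     -- recheck: guards against collisions
   AND    p1.person_id <> p.person_id
   )
ORDER  BY p.projects, p.person_id;
```

Swapping in `hash_array_extended(projects, 0)` in both the index and the query gives a 64-bit hash, which makes collisions far less likely.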