Given the following table1 schema:
+---------+-----------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+-----------------+------+-----+---------+----------------+
| ID | int | NO | PRI | NULL | auto_increment |
| NAME | bigint unsigned | NO | MUL | NULL | |
| SECONDS | int | NO | MUL | NULL | |
| LINK | int | YES | | NULL | |
+---------+-----------------+------+-----+---------+----------------+
and the following data in table1
+----+---------+---------+---------+
| ID | NAME | SECONDS | LINK |
+----+---------+---------+---------+
| 1 | 1 | 1 | 1 |
| 2 | 2 | 1 | 1 |
| 3 | 3 | 1 | 2 |
| 4 | 4 | 1 | 2 |
| 5 | 2 | 2 | 1 |
| 6 | 3 | 2 | 1 |
| 7 | 4 | 2 | 2 |
| 8 | 1 | 2 | 3 |
| 9 | 3 | 3 | 1 |
| 10 | 4 | 3 | 1 |
| 11 | 1 | 3 | 2 |
+----+---------+---------+---------+
return the following result
+---------+---------+---------+
| NAME | SECONDS | LINK |
+---------+---------+---------+
| 1 | 1 | 1 |
| 1 | 2 | 3 |
| 1 | 3 | 2 |
| 2 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 1 | 2 |
| 3 | 2 | 1 |
| 3 | 3 | 1 |
+---------+---------+---------+
The query should group all NAME
s that share the same SECONDS
and LINK
, for example
SELECT
GROUP_CONCAT(DISTINCT NAME) AS NAME_list,
SECONDS,
LINK
FROM
table1
GROUP BY
SECONDS,
LINK
HAVING
COUNT(DISTINCT NAME) > 1;
returns
+--------------+---------+---------+
| NAME_list | SECONDS | LINK |
+--------------+---------+---------+
| 1,2 | 1 | 1 |
| 3,4 | 1 | 2 |
| 2,3 | 2 | 1 |
| 3,4 | 3 | 1 |
+--------------+---------+---------+
and then given an input list of NAME
s, say (1)
, for each group above where 1
appears, in our case 1,2
, take the remaining NAME
s from the group, such as 2
, and find all NAME
s where 2
appears in some other group.
Essentially the problem is to obtain transitive relationships two levels deep, that is since 1 connects to 2, and 2 connects to 3, we need to return NAME
s 1, 2, and 3. However we do not want to go another level deeper, that is, we are after
1 -> 2 -> 3 (two links)
but not
1 -> 2 -> 3 -> 4 (we don't need to return also link to 4)
even though based on the group 2,3 we know that 3 also connects to 4 as there exits a group 3,4.
In other words, given an input list of NAME
s, for each NAME
, find all rows that are in transitive relationships two links deep A -> B -> C based on shared columns.
If possible how can this query be expressed in SQL?
2
Answers
Your sample data is a little too small to consider this interesting task.
But let’s try. Take a look at the example.
Result is
To explain the idea of the solution, I will give the intermediate results of the queries
Subquery
l2
result – NAMEs that share the same SECONDS and LINK (Very similar to the result of your query)I’ll add some test data
There result is
My approach would be to build this up using self-joins.
Step 1: Prepare a table of which rows have different
NAME
but the sameSECONDS
andLINK
(A => B):Results:
Step 2: Then self-join this table to get two steps (A => B => C):
Results:
Step 3: Then join once more to the original table to get any rows that match any of the transition steps:
Results:
This isn’t quite your desired result set (two rows are different) but I think is more accurate according to your description.