skip to Main Content
aid bid
1 2
1 3
2 3
3 4
5 6
7 8
8 10
8 9

I would like to extarct the hierarchy linkage for aid and bid.
No need to show each path, any id should show in one row only.

Expected Output:

path
1,2,3,4
5,6
7,8,9,10

I tried to use recursive function as below.

WITH RECURSIVE node_tree AS (
  SELECT aid, bid, CAST(aid AS TEXT) || ',' || CAST(bid AS TEXT) AS path
  FROM tmp
  WHERE aid NOT IN (SELECT bid FROM tmp)
  UNION ALL
  SELECT nt.aid, h.bid, nt.path || ',' || h.bid
  FROM node_tree nt
  JOIN tmp h ON nt.bid = h.aid
)
SELECT DISTINCT path
FROM node_tree
ORDER BY path;

It returns below.

path
1,2
1,2,3
1,2,3,4
1,3
1,3,4
5,6
7,8
7,8,10
7,8,9

Seems it cannot handle one Id only show in one row:

  • only need to show 1,2,3,4 and don’t want to show below.
path
1,2
1,2,3
1,3
1,3,4
  • should show 7,8,9,10 instead of below:
path
7,8
7,8,10
7,8,9

2

Answers


  1. There are several approaches using recursive queries to achieve the desired results. The following uses arrays to build the paths and keep track of the new nodes discovered with each iteration.

    WITH RECURSIVE
      t(aid, bid) AS (VALUES (1, 2),
                             (1, 3),
                             (2, 3),
                             (3, 4),
                             (5, 6),
                             (7, 8),
                             (8, 10),
                             (8, 9)),
      cte AS (SELECT DISTINCT 0 AS depth, ARRAY [t.aid] AS path, ARRAY [t.aid] AS new_nodes
                FROM t
                WHERE t.aid NOT IN (SELECT children.bid FROM t children)
              UNION ALL
              SELECT depth + 1 AS depth, cte.path || c.new_nodes AS path, c.new_nodes
                FROM cte
                  CROSS JOIN LATERAL (SELECT ARRAY_AGG(DISTINCT t.bid ORDER BY t.bid) AS new_nodes
                                        FROM t
                                        WHERE t.aid = ANY (cte.new_nodes)
                                          AND NOT t.bid = ANY (cte.path)) c
                WHERE c.new_nodes IS NOT NULL)
    SELECT DISTINCT ON (cte.path[1]) (REGEXP_MATCH(cte.path::text, '[^{}]+'))[1] AS path
      FROM cte
      ORDER BY cte.path[1], depth DESC;
    

    The recursive query begins by identifying the nodes without parents. These nodes establish the path roots and are also memoized as newly visited nodes. At each iteration, nodes that are children of the nodes visited in the prior iteration, but that have not yet been visited, are added to the path and memoized as newly visited nodes. This process continues until all nodes have been visited.

    The final part of the query uses DISTINCT ON with rows sorted in ascending order by the path root and descending by depth to return only the full path from each root node. The path array is converted to text and REGEXP_MATCH is used to remove the enclosing braces.

    Login or Signup to reply.
  2. Your attempt is very good, although it’s missing something. I’d avoid to generate the string already in the recursion: probably better to extract each child for all anchestors, by recording the index of each children. And then apply aggregation of the children, ordered by index, and concatenated to their own anchestor.

    WITH RECURSIVE ordered_tab AS (
        SELECT * FROM tab ORDER BY aid, bid
    ), cte AS (
        SELECT aid, MIN(bid) AS bid, 0 AS idx 
        FROM tab t1
        WHERE NOT EXISTS(SELECT 1 FROM tab t2 WHERE t1.aid = t2.bid)
        GROUP BY aid
      
        UNION ALL 
      
        SELECT cte.aid, tab.bid, idx + 1
        FROM       cte
        INNER JOIN ordered_tab tab
                ON cte.bid = tab.aid
    )
    SELECT CONCAT(aid, ',', STRING_AGG(bid::VARCHAR, ',' ORDER BY idx)) AS path
    FROM cte
    GROUP BY aid
    

    Output:

    path
    1,2,3,4
    5,6
    7,8,9,10

    Check the demo here.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search