I have a representation of a (binary) tree as a table in PostgreSQL. Each row is a node of the tree and each row has a parent column and a color.
The nodes at he leaves of the tree have a color (their color column is not null and can be either green or red).
My goal is to color the whole tree. With these rules:
- If a parent has one child, then the parent’s color is the child’s color.
- If a parent has two children and both have the same color then the parent color is its children color.
- If a parent has two children and they have different color, then the parent color is gray.
How can I write a recursive query in PostgreSQL for this algorithm?
Example of a tree:
CREATE TEMPORARY TABLE family (
node text
, parent text
, color text
);
INSERT INTO family (parent, node, color)
VALUES
(null , 'A' , null)
, ('A' , 'AB' , null)
, ('A' , 'AC' , null)
, ('AB' , 'ABD' , null)
, ('ABD', 'ABDE', 'green')
, ('AC' , 'ACF' , null)
, ('AC' , 'ACG' , null)
, ('ACF', 'ACFH', 'red')
, ('ACF', 'ACFI', 'green')
, ('ACG', 'ACGJ', 'red')
, ('ACG', 'ACGK', 'red')
;
This tree, fully colored:
2
Answers
You can use a recursive
cte
, aggregating after you perform thejoin
on the parent node in a subquery:See fiddle.
A recursive CTE does not allow aggregation in the recursive term.
I suggest to update the original table one level at a time, top-down:
fiddle
If you don’t want to touch the original table, operate on a copy of the table, maybe a
TEMPORARY
table for best performance.I use
length(node)
as indicator of the nesting level. If that’s unreliable, replace it with a reliable nesting level. Either way, if the table is big and there are many levels, an index on the level indicator will help performance. For the demo that would be:The added
WHERE
clausep.color IS DISTINCT FROM c.color
skips empty updates. See:Related: