I have a table of changes like this
CREATE TABLE IF NOT EXISTS changes (
entity_id TEXT NOT NULL,
column_id TEXT NOT NULL,
value JSONB NOT NULL,
updated_at TIMESTAMP NOT NULL
);
INSERT INTO changes VALUES ('1', 'height', to_jsonb(140), '01-01-2021 00:00:00'::TIMESTAMP);
INSERT INTO changes VALUES ('1', 'weight', to_jsonb(30), '01-01-2021 00:00:00'::TIMESTAMP);
INSERT INTO changes VALUES ('1', 'height', to_jsonb(145), '01-02-2021 00:00:00'::TIMESTAMP);
INSERT INTO changes VALUES ('1', 'weight', to_jsonb(34),'01-03-2021 00:00:00'::TIMESTAMP);
entity_id | column_id | value | updated_at
-----------+-----------+-------+---------------------
1 | height | 140 | 2021-01-01 00:00:00
1 | weight | 30 | 2021-01-01 00:00:00
1 | height | 145 | 2021-01-02 00:00:00
1 | weight | 34 | 2021-01-03 00:00:00
And I want to get kinda cumulative view of this table
entity_id | height | weight | updated_at
-----------+--------+--------+---------------------
1 | 140 | 30 | 2021-01-01 00:00:00
1 | 145 | 30 | 2021-01-02 00:00:00
1 | 145 | 34 | 2021-01-03 00:00:00
My current query looks working
SELECT
entity_id,
coalesce(change->'height', lag(change->'height', 1, null) over (partition by entity_id order by updated_at)) as height,
coalesce(change->'weight', lag(change->'weight', 1, null) over (partition by entity_id order by updated_at)) as weight,
updated_at
FROM (
SELECT entity_id, json_object_agg(column_id, value) as change, updated_at FROM changes
GROUP BY entity_id, updated_at
) as changes;
But I don’t like json_object_agg
here and I’m sure there is a way do it without redundant aggregations? some kind of using window aggregate function that I’ve missed.
UPD. @SelVazi helped to make query better, but I feel it’s not the final solution.
with cte as (
SELECT
entity_id,
max(case when column_id = 'height' then value::int end) as height,
max(case when column_id = 'weight' then value::int end) as weight,
updated_at
from changes
GROUP by entity_id, updated_at
)
select
entity_id,
coalesce(height, lag(height) over (partition by entity_id order by updated_at)) as height,
coalesce(weight, lag(weight) over (partition by entity_id order by updated_at)) as weight,
updated_at
from cte;
2
Answers
Instead of using
json_object_agg
you can use the conditional aggregation to get hight and width as columns :The final query :
Result :
This is more complicated that it may seem. Pivoting the height and weight to columns can be done with conditional aggregation, but then we have to fill the "missing" values.
I would assume that there could be gaps of more than one date for any of the two measures, which makes
lag()
the wrong fit in Postgres, since it can only look back over a predefined number of rows (and cannot ignorenull
values).We can demonstrate the problem with
lag()
by adding just one row at the end of your sample data:One workaround uses a gaps-and-islands technique to put "missing" values in groups that start with a non-
null
value, which then becomes the new value.fiddle