I’m trying to select pairs of events from my data. I have a data that looks like this:
create table test(id,status,created)as values
(1, 'on','2024-09-01'::date)
,(2, 'on','2024-09-02'::date)
,(1,'off','2024-09-03'::date)
,(1, 'on','2024-09-04'::date)
,(2,'off','2024-09-05'::date)
,(3, 'on','2024-09-06'::date)
,(1,'off','2024-09-07'::date)
,(4,'off','2024-09-08'::date);
I would like to have data like this:
id | switched_on | switched_off |
---|---|---|
1 | 2024-09-01 | 2024-09-03 |
1 | 2024-09-04 | 2024-09-07 |
2 | 2024-09-02 | 2024-09-05 |
3 | 2024-09-06 | |
4 | 2024-09-08 |
I am interested in:
- multiple events for the same
id
in the data null
values when either end of event pair is missing
2
Answers
An example using the self-join:
id
on
events, right for lateroff
.on
event with all lateroff
s for eachid
. Adistinct on
keeps only a single match for eachid
andon
event, selecting the soonestoff
.demo at db<>fiddle
This is no easy task. What I do below is to build groups of up to two rows using the internal row ID called
CTID
in PostgreSQL. All rows use their own CTID for a group key, except for ‘off’ rows following immediately an ‘on’ row. These get the ‘on’ rows’ CTIDs instead, thus coupling with them.In the next step I aggregate by those group keys showing the switch_on and switch_off values. At last PostgreSQL forces me to wrap this again in a select, because it is unable to apply
COALESCE
on the selected alias names, which I consider a bug.Demo: https://dbfiddle.uk/KqJoUw_O