Group multiple events for ID in postgreSQL data

j1n3l0
September 10, 2024
85 views
0 votes
2 Answers

I’m trying to select pairs of events from my data. I have a data that looks like this:

create table test(id,status,created)as values
 (1, 'on','2024-09-01'::date)
,(2, 'on','2024-09-02'::date)
,(1,'off','2024-09-03'::date)
,(1, 'on','2024-09-04'::date)
,(2,'off','2024-09-05'::date)
,(3, 'on','2024-09-06'::date)
,(1,'off','2024-09-07'::date)
,(4,'off','2024-09-08'::date);

I would like to have data like this:

id	switched_on	switched_off
1	2024-09-01	2024-09-03
1	2024-09-04	2024-09-07
2	2024-09-02	2024-09-05
3	2024-09-06
4		2024-09-08

I am interested in:

multiple events for the same id in the data
null values when either end of event pair is missing

Tags: postgresql sql

Answers

- Zegarek
- September 10, 2024 at 2:31 pm
- 0 votes
0
An example using the self-join:
1. Match on id
2. Make sure the left table is for earlier on events, right for later off.
3. This matches every on event with all later offs for each id. A distinct on keeps only a single match for each id and on event, selecting the soonest off.
_{demo at db<>fiddle}
```
select distinct on (1,2)
       coalesce(t1.id,t2.id)
      ,t1.created as switched_on 
      ,t2.created as switched_off
from test as t1 
full join test as t2 
       on t1.id=t2.id 
      and t1.created<t2.created
      and t1.status='on'
      and t2.status='off'
where coalesce(t1.status,'on')='on' 
  and coalesce(t2.status,'off')='off'
order by 1,2, t2.created-t1.created;
```
coalesce switched_on switched_off

1 2024-09-01 2024-09-03

1 2024-09-04 2024-09-07

2 2024-09-02 2024-09-05

3 2024-09-06 null

4 null 2024-09-08
Login or Signup to reply.

- ThorstenKettner
- September 10, 2024 at 4:47 pm
- 0 votes
0
This is no easy task. What I do below is to build groups of up to two rows using the internal row ID called CTID in PostgreSQL. All rows use their own CTID for a group key, except for ‘off’ rows following immediately an ‘on’ row. These get the ‘on’ rows’ CTIDs instead, thus coupling with them.

In the next step I aggregate by those group keys showing the switch_on and switch_off values. At last PostgreSQL forces me to wrap this again in a select, because it is unable to apply COALESCE on the selected alias names, which I consider a bug.
```
select *
from
(
  select
    id,
    min(created) filter (where status = 'on') as switch_on,
    min(created) filter (where status = 'off') as switch_off
  from
  (
    select 
      id,
      status,
      created,
      case when status = 'off' and lag(status) over w = 'on'
        then lag(ctid) over w
        else ctid
        end as grp
    from test
    window w as (partition by id order by created, status desc)
  ) grouped
  group by grp, id
) result
order by id, coalesce(switch_on, switch_off);
```
Demo: https://dbfiddle.uk/KqJoUw_O
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.