skip to Main Content

I’m trying to select pairs of events from my data. I have a data that looks like this:

create table test(id,status,created)as values
 (1, 'on','2024-09-01'::date)
,(2, 'on','2024-09-02'::date)
,(1,'off','2024-09-03'::date)
,(1, 'on','2024-09-04'::date)
,(2,'off','2024-09-05'::date)
,(3, 'on','2024-09-06'::date)
,(1,'off','2024-09-07'::date)
,(4,'off','2024-09-08'::date);

I would like to have data like this:

id switched_on switched_off
1 2024-09-01 2024-09-03
1 2024-09-04 2024-09-07
2 2024-09-02 2024-09-05
3 2024-09-06
4 2024-09-08

I am interested in:

  1. multiple events for the same id in the data
  2. null values when either end of event pair is missing

2

Answers


  1. An example using the self-join:

    1. Match on id
    2. Make sure the left table is for earlier on events, right for later off.
    3. This matches every on event with all later offs for each id. A distinct on keeps only a single match for each id and on event, selecting the soonest off.

    demo at db<>fiddle

    select distinct on (1,2)
           coalesce(t1.id,t2.id)
          ,t1.created as switched_on 
          ,t2.created as switched_off
    from test as t1 
    full join test as t2 
           on t1.id=t2.id 
          and t1.created<t2.created
          and t1.status='on'
          and t2.status='off'
    where coalesce(t1.status,'on')='on' 
      and coalesce(t2.status,'off')='off'
    order by 1,2, t2.created-t1.created;
    
    coalesce switched_on switched_off
    1 2024-09-01 2024-09-03
    1 2024-09-04 2024-09-07
    2 2024-09-02 2024-09-05
    3 2024-09-06 null
    4 null 2024-09-08
    Login or Signup to reply.
  2. This is no easy task. What I do below is to build groups of up to two rows using the internal row ID called CTID in PostgreSQL. All rows use their own CTID for a group key, except for ‘off’ rows following immediately an ‘on’ row. These get the ‘on’ rows’ CTIDs instead, thus coupling with them.

    In the next step I aggregate by those group keys showing the switch_on and switch_off values. At last PostgreSQL forces me to wrap this again in a select, because it is unable to apply COALESCE on the selected alias names, which I consider a bug.

    select *
    from
    (
      select
        id,
        min(created) filter (where status = 'on') as switch_on,
        min(created) filter (where status = 'off') as switch_off
      from
      (
        select 
          id,
          status,
          created,
          case when status = 'off' and lag(status) over w = 'on'
            then lag(ctid) over w
            else ctid
            end as grp
        from test
        window w as (partition by id order by created, status desc)
      ) grouped
      group by grp, id
    ) result
    order by id, coalesce(switch_on, switch_off);
    

    Demo: https://dbfiddle.uk/KqJoUw_O

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search