skip to Main Content
create table hall_events
(
hall_id integer,
start_date date,
end_date date
);

delete from hall_events;

insert into hall_events values 
(1,'2023-01-13','2023-01-14')
,(1,'2023-01-14','2023-01-17')
,(1,'2023-01-15','2023-01-17')
,(1,'2023-01-18','2023-01-25')
,(2,'2022-12-09','2022-12-23')
,(2,'2022-12-13','2022-12-17')
,(3,'2022-12-01','2023-01-30');

select * from hall_events;

This is my table

**my output shoud be **

hall_id start_date end_date
1       2023-01-13 2023-01-17
1       2023-01-18 2023-01-25
2       2022-12-09 2022-12-17
3       2022-12-01 2023-01-30

here is my query

WITH cte as
(select *,row_number()over(order by hall_id,start_date) as event_id
from(
with recursive r_cte as(select *,1 as flag from cte where event_id=1
              union all
              select cte.hall_id,cte.start_date,cte.end_date,cte.event_id,
              case when cte.hall_id=r_cte.hall_id and 
              (cte.start_date between r_cte.start_date and r_cte.end_date or
              r_cte.start_date between cte.start_date and cte.end_date) 
              then 0 else 1 end +flag
              from r_cte
              inner join cte on r_cte.event_id+1=cte.event_id)
select * from cte)x)

2

Answers


  1. This is a "gaps and islands" problem where overlapping dates need to be treated as an "island". This first CTE identifies the start of each "island" of continuous dates per hall_id using the LAG function. The second CTE supplies a "group number" (i.e. to identify all rows of each island) by using a running total of the start_flag column. Then it groups by hall_id and the group number to find the minimum start_date and maximum end_date for each island.

    WITH CTE AS (
        SELECT hall_id,
               start_date,
               end_date,
               CASE
                   WHEN start_date <= LAG(end_date) OVER (PARTITION BY hall_id ORDER BY start_date) THEN 0
                   ELSE 1
               END AS start_flag
        FROM hall_events
    ),
    CTE2 AS (
        SELECT hall_id,
               start_date,
               end_date,
               SUM(start_flag) OVER (PARTITION BY hall_id ORDER BY start_date) AS grp
        FROM CTE
    )
    SELECT hall_id,
           MIN(start_date) AS start_date,
           MAX(end_date) AS end_date
    FROM CTE2
    GROUP BY hall_id, grp
    ORDER BY hall_id, start_date;
    
    hall_id start_date end_date
    1 2023-01-13 2023-01-17
    1 2023-01-18 2023-01-25
    2 2022-12-09 2022-12-23*
    3 2022-12-01 2023-01-30

    *note the end date for id 2 is greater than shown in the wanted result, but 2022-12-23 is the maximum date for id 2 in the data.

    fiddle

    If you have trouble understanding how this works, just output the result of CTE2

    hall_id start_date  end_date    grp
    1   2023-01-13  2023-01-14  1
    1   2023-01-14  2023-01-17  1
    1   2023-01-15  2023-01-17  1
    1   2023-01-18  2023-01-25  2
    2   2022-12-09  2022-12-23  1
    2   2022-12-13  2022-12-17  1
    3   2022-12-01  2023-01-30  1
    

    and here is an alternative way of writing the query without CTE’s:

    SELECT hall_id, MIN(start_date) AS start_date, MAX(end_date) AS end_date
    FROM (
      SELECT hall_id, start_date, end_date,
             SUM(flag) OVER (PARTITION BY hall_id ORDER BY start_date) AS grp
      FROM (
        SELECT hall_id, start_date, end_date,
               CASE WHEN start_date <= LAG(end_date) OVER (PARTITION BY hall_id ORDER BY start_date) THEN 0 ELSE 1 END AS flag
        FROM hall_events
      ) t
    ) t
    GROUP BY hall_id, grp
    ORDER BY hall_id, start_date;
    

    see: https://dbfiddle.uk/_R0AXK2V

    Login or Signup to reply.
  2. The key gotcha is IF the new start_date should be less than or equal to the previous end_date, then it should be grouped together.

    WITH cte AS (
        SELECT
            hall_id,
            start_date,
            end_date,
            start_date <= lag(end_date, 1, 'infinity') OVER (PARTITION BY hall_id ORDER BY start_date)
            OR FALSE AS step
        FROM
            hall_events AS h1
    ),
    cte1 AS (
        SELECT
            *,
            min(start_date) OVER (PARTITION BY hall_id,
                step),
            max(end_date) OVER (PARTITION BY hall_id,
                step)
        FROM cte
    )
    SELECT DISTINCT
        hall_id,
        min,
        max
    FROM
        cte1
    ORDER BY
        hall_id,
        min,
        max;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search