create table hall_events
(
hall_id integer,
start_date date,
end_date date
);
delete from hall_events;
insert into hall_events values
(1,'2023-01-13','2023-01-14')
,(1,'2023-01-14','2023-01-17')
,(1,'2023-01-15','2023-01-17')
,(1,'2023-01-18','2023-01-25')
,(2,'2022-12-09','2022-12-23')
,(2,'2022-12-13','2022-12-17')
,(3,'2022-12-01','2023-01-30');
select * from hall_events;
This is my table
**my output shoud be **
hall_id start_date end_date
1 2023-01-13 2023-01-17
1 2023-01-18 2023-01-25
2 2022-12-09 2022-12-17
3 2022-12-01 2023-01-30
here is my query
WITH cte as
(select *,row_number()over(order by hall_id,start_date) as event_id
from(
with recursive r_cte as(select *,1 as flag from cte where event_id=1
union all
select cte.hall_id,cte.start_date,cte.end_date,cte.event_id,
case when cte.hall_id=r_cte.hall_id and
(cte.start_date between r_cte.start_date and r_cte.end_date or
r_cte.start_date between cte.start_date and cte.end_date)
then 0 else 1 end +flag
from r_cte
inner join cte on r_cte.event_id+1=cte.event_id)
select * from cte)x)
2
Answers
This is a "gaps and islands" problem where overlapping dates need to be treated as an "island". This first CTE identifies the start of each "island" of continuous dates per hall_id using the LAG function. The second CTE supplies a "group number" (i.e. to identify all rows of each island) by using a running total of the start_flag column. Then it groups by hall_id and the group number to find the minimum start_date and maximum end_date for each island.
*note the end date for id 2 is greater than shown in the wanted result, but 2022-12-23 is the maximum date for id 2 in the data.
fiddle
If you have trouble understanding how this works, just output the result of CTE2
and here is an alternative way of writing the query without CTE’s:
see: https://dbfiddle.uk/_R0AXK2V
The key gotcha is IF the new start_date should be less than or equal to the previous end_date, then it should be grouped together.