I have a table which looks like below:
CREATE TABLE time_records (
id uuid NOT NULL,
employee_id uuid NOT NULL,
starttime timestamp NOT NULL,
endtime timestamp NOT NULL
)
There will be overlap in times between the records for the same employee_id:
id | employee_id | starttime | endtime |
---|---|---|---|
1 | 1 | ‘2023-09-01 07:00:00’ | ‘2023-09-01 09:15:00’ |
2 | 1 | ‘2023-09-01 07:00:00’ | ‘2023-09-01 15:00:00’ |
3 | 1 | ‘2023-09-01 07:00:00’ | ‘2023-09-01 15:00:00’ |
4 | 1 | ‘2023-09-01 14:00:00’ | ‘2023-09-01 15:00:00’ |
5 | 1 | ‘2023-09-01 23:45:00’ | ‘2023-09-01 23:59:00’ |
6 | 1 | ‘2023-09-01 23:45:00’ | ‘2023-09-01 23:59:00’ |
What I’m trying to do is merge the overlapping records into consolidated time ranges per employee, together with the ids that make up each range:
employee_id | starttime | endtime | ids |
---|---|---|---|
1 | ‘2023-09-01 07:00:00’ | ‘2023-09-01 15:00:00’ | [1,2,3,4] |
1 | ‘2023-09-01 23:45:00’ | ‘2023-09-01 23:59:00’ | [5,6] |
I can get this to work if there’s only one set of overlapping time within a day using max/min for the start and end times, but I can’t seem to make it work when there are multiple sets of overlapping time in a day:
select timea.employee_id,
min(timea.starttime) starttime,
max(timea.endtime) endtime,
array_agg(timea.id) ids
from time_records timea
inner join time_records timea2 on timea.employee_id = timea2.employee_id and
tsrange(timea2.starttime, timea2.endtime, '[]') &&
tsrange(timea.starttime, timea.endtime, '[]')
and timea.id != timea2.id
group by timea.employee_id;
With results:
employee_id | starttime | endtime | ids |
---|---|---|---|
1 | ‘2023-09-01 07:00:00’ | ‘2023-09-01 23:59:00’ | [1,2,3,4,5,6] |
2 Answers
Using a CTE to produce the maximum endtime for each starttime, the largest overlapping intervals can then be found and the original time_records table joined back onto it with an aggregation:
See fiddle
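A minimal sketch of what such a query could look like, not necessarily the exact query from the fiddle. It assumes that every group of overlapping records is fully covered by one record's start time paired with the latest end time for that start. Chains of partially overlapping records (07:00 to 09:00 followed by 08:00 to 10:00, say) would not be merged this way.

```sql
WITH cte AS (
   -- longest interval for every distinct start time of an employee
   SELECT employee_id, starttime, max(endtime) AS endtime
   FROM   time_records
   GROUP  BY employee_id, starttime
)
SELECT c.employee_id, c.starttime, c.endtime
     , array_agg(t.id ORDER BY t.id) AS ids
FROM   cte c
JOIN   time_records t ON  t.employee_id = c.employee_id
                      AND t.starttime  >= c.starttime
                      AND t.endtime    <= c.endtime
WHERE  NOT EXISTS (
   -- drop intervals that lie inside another interval of the same employee
   SELECT 1
   FROM   cte c2
   WHERE  c2.employee_id = c.employee_id
   AND    c2.starttime  <= c.starttime
   AND    c2.endtime    >= c.endtime
   AND   (c2.starttime, c2.endtime) <> (c.starttime, c.endtime)
   )
GROUP  BY c.employee_id, c.starttime, c.endtime
ORDER  BY c.employee_id, c.starttime;
```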
Plain aggregation with min() and max() cannot solve this. Which rows eventually form a group only becomes evident after merging ranges.
The aggregate function range_agg() makes the task a whole lot simpler. It was added with Postgres 14. Just computing merged ranges is very simple now:
To also get an array of involved IDs, we need to do more. One way is to join back to the underlying table and then aggregate once more (groups are identified by the merged ranges now):
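Both steps might be sketched as follows, assuming Postgres 14 or later and timestamp columns, so that tsrange applies as in the question. The aliases merged, m and t are illustrative. The first query only merges ranges; the second joins back to collect the ids that fall inside each merged range.

```sql
-- Step 1: merged ranges per employee
SELECT employee_id
     , unnest(range_agg(tsrange(starttime, endtime, '[]'))) AS merged
FROM   time_records
GROUP  BY employee_id;

-- Step 2: join back to the table and aggregate ids per merged range
SELECT m.employee_id
     , lower(m.merged) AS starttime
     , upper(m.merged) AS endtime
     , array_agg(t.id ORDER BY t.id) AS ids
FROM  (
   SELECT employee_id
        , unnest(range_agg(tsrange(starttime, endtime, '[]'))) AS merged
   FROM   time_records
   GROUP  BY employee_id
   ) m
JOIN   time_records t ON  t.employee_id = m.employee_id
                      AND tsrange(t.starttime, t.endtime, '[]') <@ m.merged
GROUP  BY m.employee_id, m.merged
ORDER  BY m.employee_id, m.merged;
```

Because range_agg() returns a multirange, unnest() is what turns it back into one row per merged range.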
Another way with a LATERAL subquery:
fiddle
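A sketch of the LATERAL variant under the same assumptions (Postgres 14 or later, timestamp columns, illustrative aliases). The merged ranges are computed once, and the ids are collected per range in a correlated subquery instead of a second GROUP BY.

```sql
SELECT m.employee_id
     , lower(m.merged) AS starttime
     , upper(m.merged) AS endtime
     , i.ids
FROM  (
   SELECT employee_id
        , unnest(range_agg(tsrange(starttime, endtime, '[]'))) AS merged
   FROM   time_records
   GROUP  BY employee_id
   ) m
CROSS  JOIN LATERAL (
   -- ids of all records of this employee inside the merged range
   SELECT array_agg(t.id ORDER BY t.id) AS ids
   FROM   time_records t
   WHERE  t.employee_id = m.employee_id
   AND    tsrange(t.starttime, t.endtime, '[]') <@ m.merged
   ) i
ORDER  BY m.employee_id, m.merged;
```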
Not sure if either query is also faster than my custom function below, because that only iterates over the whole table once.
Postgres 9.6
While stuck on your outdated version, create a custom set-returning function (once):
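One way such a function could look, as a hedged sketch rather than the original code. The name f_merged_time_records is hypothetical, and the column types are assumed from the CREATE TABLE in the question (uuid ids, timestamp columns). It walks the table once in (employee_id, starttime) order and extends the current group while the next row starts at or before the running end time.

```sql
CREATE OR REPLACE FUNCTION f_merged_time_records()  -- hypothetical name
  RETURNS TABLE (employee_id uuid, starttime timestamp, endtime timestamp, ids uuid[])
  LANGUAGE plpgsql AS
$func$
DECLARE
   r record;
BEGIN
   FOR r IN
      SELECT t.id, t.employee_id, t.starttime, t.endtime
      FROM   time_records t
      ORDER  BY t.employee_id, t.starttime
   LOOP
      IF employee_id = r.employee_id AND r.starttime <= endtime THEN
         -- row overlaps the current group: extend it
         endtime := greatest(endtime, r.endtime);
         ids     := ids || r.id;
      ELSE
         -- row starts a new group: emit the finished one first, if any
         IF ids IS NOT NULL THEN
            RETURN NEXT;
         END IF;
         employee_id := r.employee_id;
         starttime   := r.starttime;
         endtime     := r.endtime;
         ids         := ARRAY[r.id];
      END IF;
   END LOOP;

   IF ids IS NOT NULL THEN
      RETURN NEXT;   -- emit the last group
   END IF;
END
$func$;

-- Example call
SELECT * FROM f_merged_time_records();
```

The output columns double as loop state here: they start out NULL, so the first row always opens a new group. Ids are appended in the order rows are read, which is by start time and not by id.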
Call:
fiddle
Unlike the queries above, the array in ids is unsorted. You need to do more if you need that.