I have a table that stores data about events with `id`, `start_timestamp` (timestamp with time zone), `end_timestamp` (timestamp with time zone) and `duration` (in seconds):
create table my_table(
id bigint generated by default as identity primary key
, start_timestamp timestamptz
, end_timestamp timestamptz
, duration bigint);
insert into my_table values
(0, '2024-10-01 03:00:00+00', '2024-10-01 15:00:00+00', 43200)
,(1, '2024-10-02 05:00:00+00', '2024-10-03 17:00:00+00', 129600)
,(2, '2024-10-04 12:00:00+00', '2024-10-07 09:45:00+00', 251100);
What I need to get from it is the duration of each event during working hours, meaning 9:00 to 18:00, Monday to Friday.
Below are the expected results from each row with an explanation:
Event ID #0
id | start_timestamp | end_timestamp | duration | working_hours_duration |
---|---|---|---|---|
0 | 2024-10-01 03:00:00+00 | 2024-10-01 15:00:00+00 | 43200 | 21600 |
The event started before working hours, so the time from `2024-10-01 03:00:00+00` should be ignored, and only the difference between `2024-10-01 09:00:00+00` and `2024-10-01 15:00:00+00` should be calculated. The expected result is `21600` (seconds).
Event ID #1
id | start_timestamp | end_timestamp | duration | working_hours_duration |
---|---|---|---|---|
1 | 2024-10-02 05:00:00+00 | 2024-10-03 17:00:00+00 | 129600 | 61200 |
Here the event not only started before working hours but also ended the next day, so the gap between the two working days should be excluded as well. The expected result covers the full working-hours range for `2024-10-02`, which is 09:00 to 18:00, plus 09:00 to 17:00 on `2024-10-03`. That totals `17h`, so the expected result is `61200` (seconds).
Event ID #2
id | start_timestamp | end_timestamp | duration | working_hours_duration |
---|---|---|---|---|
2 | 2024-10-04 12:00:00+00 | 2024-10-07 09:45:00+00 | 251100 | 24300 |
This one is the most complicated, as it also includes a weekend, which should be subtracted from the duration.
The event starts on Friday (`2024-10-04`) at 12:00, so all the time until the end of that working day should be included (`18:00 - 12:00 = 6 hours`). Then the weekend starts, so everything until the start of working hours on Monday (`2024-10-07`) at 09:00 should be ignored. On Monday the duration is `09:45:00 - 09:00:00 = 00:45`. Summed up, the result is `6h45'` = `24300` (seconds).
Question:
Is this something that can be achieved solely on the PostgreSQL side, with a query, or should I simply fetch the raw data and process it in the backend instead?
5 Answers
1. `date_trunc('week',start_timestamp)` gets you the Monday midnight of the week your range starts.
2. Build a `tstzrange()`, setting the bounds by adding the adequate `+'9:00'::time` to that Monday midnight.
3. `Cross join lateral generate_series(0,4) as day_steps` spawns 5 rows for each of your input rows. Adding that many days to the reference Monday midnight gives the working hours on each workday.
4. Use `range_agg()` to reduce all of these `tstzrange`s to a single `tstzmultirange` that represents all working hour ranges on all workdays of each week your input rows stretch over.

demo at db<>fiddle
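As a minimal sketch of the steps above (not necessarily the query from the linked demo), the per-row working-hours multirange could be built like this. It assumes PostgreSQL 14 or later (multiranges), a session time zone of UTC to match the sample data, and `make_interval()` offsets in place of `'9:00'::time`. The outer `generate_series()` over weeks and the alias `week_start` are my own additions, so that an event crossing a weekend picks up the workdays of both weeks.

```sql
-- Assumption: UTC session, so date_trunc() agrees with the +00 sample data.
set time zone 'UTC';

select t.id,
       -- step 4: collapse the per-day ranges into one tstzmultirange per event
       range_agg(
         -- step 2: 09:00-18:00 on the given workday, as a half-open [) range
         tstzrange(week_start + make_interval(days => day_steps, hours => 9),
                   week_start + make_interval(days => day_steps, hours => 18))
       ) as working_hours
from my_table as t
-- step 1: Monday midnight of every week the event touches
cross join lateral generate_series(
             date_trunc('week', t.start_timestamp),
             date_trunc('week', t.end_timestamp),
             interval '1 week') as week_start
-- step 3: one row per workday, Monday (0) to Friday (4)
cross join lateral generate_series(0, 4) as day_steps
group by t.id
order by t.id;
```

For event #0 this yields five ranges (one week), for event #2 ten ranges, because it starts on a Friday and ends the following Monday.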
You can intersect your inputs with their corresponding workday ranges with a `*` operator:

`Unnest()` the resulting intersection of ranges and `sum()` their `upper()-lower()` widths. You can convert the resulting `interval` to a number of seconds represented by an `integer`, but I’d just keep the more flexible and adequate `interval`.

Ranges and multiranges in combination with a calendar can solve this issue in plain SQL:
In this example I used `generate_series()`; it would be better to replace it with a table that holds a complete calendar, including holidays etc.
It’s the multirange operator `*` that does all the work to compute the intersection.
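A rough sketch of what such a calendar-based query might look like, assuming PostgreSQL 14+, a UTC session, and a throwaway `calendar` CTE whose hard-coded date bounds are chosen only to cover the sample rows. The names `calendar`, `working_hours` and `w` are illustrative, not taken from the answer.

```sql
set time zone 'UTC';  -- assumption: matches the +00 sample data

with calendar as (
  -- stand-in for a real calendar table: Mon-Fri, 09:00-18:00, no holidays
  select range_agg(tstzrange(d + interval '9 hours',
                             d + interval '18 hours')) as working_hours
  from generate_series(timestamptz '2024-09-30',
                       timestamptz '2024-10-13',
                       interval '1 day') as d
  where extract(isodow from d) between 1 and 5
)
select t.id,
       w.working_hours_duration,
       extract(epoch from w.working_hours_duration)::bigint as seconds
from my_table as t
cross join calendar as c
cross join lateral (
  -- intersect the event with the calendar, then add up the pieces
  select coalesce(sum(upper(r) - lower(r)), interval '0') as working_hours_duration
  from unnest(c.working_hours
              * tstzmultirange(tstzrange(t.start_timestamp, t.end_timestamp))) as r
) as w
order by t.id;
```

With the three sample rows, `seconds` should match the expected 21600, 61200 and 24300 from the question. The event range is wrapped in `tstzmultirange()` so that both operands of `*` are multiranges.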
First, split the multi-day intervals into one row per day, using `generate_series(…)`.
Then join those rows to the work schedule table (`timetable`) on the day of the week.
See example fiddle
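A minimal sketch of that approach, assuming a UTC session and a `timetable` with hypothetical columns `isodow`, `opens` and `closes` (the actual table layout in the fiddle may differ). Here it is faked with a CTE covering Monday to Friday.

```sql
set time zone 'UTC';  -- assumption: matches the +00 sample data

with timetable(isodow, opens, closes) as (
  -- hypothetical schedule: ISO weekdays 1-5, 09:00-18:00 as offsets from midnight
  select g, interval '9 hours', interval '18 hours'
  from generate_series(1, 5) as g
)
select t.id,
       coalesce(w.seconds, 0) as working_hours_duration
from my_table as t
cross join lateral (
  select sum(extract(epoch from clip.to_ts - clip.from_ts))::bigint as seconds
  -- one row per calendar day the event touches
  from generate_series(date_trunc('day', t.start_timestamp),
                       date_trunc('day', t.end_timestamp),
                       interval '1 day') as day_start
  -- weekends have no timetable row, so they drop out of the join
  join timetable as tt on tt.isodow = extract(isodow from day_start)
  -- clip the event to that day's opening hours
  cross join lateral (
    select greatest(t.start_timestamp, day_start + tt.opens)  as from_ts,
           least(t.end_timestamp,      day_start + tt.closes) as to_ts
  ) as clip
  where clip.to_ts > clip.from_ts
) as w
order by t.id;
```

Because the aggregation happens inside the lateral subquery, an event that falls entirely outside working hours still shows up, with a duration of 0.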
Another spin of the same idea, trying to avoid a whole calendar in the `tstzmultirange`: multiply whole days in the range that don’t fall on weekends by `'9 hours'`, then only check the intersection of the start and end timestamps with workdays, if they happen on them:

demo at db<>fiddle
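One way this could look, as a sketch under the same assumptions as before (PostgreSQL 14+, UTC session). The aliases `b`, `mid` and `edges` are made up for illustration, and whole days are still counted with `generate_series()`, which may differ from how the linked demo does it.

```sql
set time zone 'UTC';  -- assumption: matches the +00 sample data

select t.id,
       extract(epoch from mid.workdays * interval '9 hours' + edges.partial)::bigint
         as working_hours_duration
from my_table as t
cross join lateral (
  select date_trunc('day', t.start_timestamp) as first_day,
         date_trunc('day', t.end_timestamp)   as last_day
) as b
cross join lateral (
  -- whole Mon-Fri days strictly between the first and the last day
  select count(*) as workdays
  from generate_series(b.first_day + interval '1 day',
                       b.last_day  - interval '1 day',
                       interval '1 day') as d
  where extract(isodow from d) < 6
) as mid
cross join lateral (
  -- first and last day only: overlap of the event with that day's 09:00-18:00
  select coalesce(sum(upper(o.r) - lower(o.r)), interval '0') as partial
  from unnest(array[b.first_day, b.last_day]) with ordinality as edge(d, n)
  cross join lateral (
    select tstzrange(t.start_timestamp, t.end_timestamp)
         * tstzrange(edge.d + interval '9 hours',
                     edge.d + interval '18 hours') as r
  ) as o
  where extract(isodow from edge.d) < 6
    and (edge.n = 1 or b.last_day > b.first_day)  -- count a same-day event once
) as edges
order by t.id;
```

For event #2 there are no whole workdays in between (only Saturday and Sunday), so the result comes entirely from the two edge days: 6 hours on Friday plus 45 minutes on Monday.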
I would first create a helper function that splits the interval `end_timestamp - start_timestamp` into daily intervals. It might be useful for other cases too.

Then use a scalar subquery for `working_hours_duration`.

DB Fiddle demo
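A sketch of how such a helper and the scalar subquery might look. The function name `split_into_days` and its column names are hypothetical (the version in the demo may be shaped differently), and a UTC session is assumed so that day boundaries line up with the sample data.

```sql
set time zone 'UTC';  -- assumption: matches the +00 sample data

-- Hypothetical helper: one row per calendar day touched by [p_start, p_end)
create or replace function split_into_days(p_start timestamptz, p_end timestamptz)
returns table (day_start timestamptz, piece_start timestamptz, piece_end timestamptz)
language sql stable
as $$
  select d,
         greatest(p_start, d),
         least(p_end, d + interval '1 day')
  from generate_series(date_trunc('day', p_start),
                       date_trunc('day', p_end),
                       interval '1 day') as d
  where least(p_end, d + interval '1 day') > greatest(p_start, d);
$$;

select t.*,
       ( -- scalar subquery: clip each daily piece to 09:00-18:00 on Mon-Fri
         select coalesce(sum(extract(epoch from
                    least(s.piece_end,      s.day_start + interval '18 hours')
                  - greatest(s.piece_start, s.day_start + interval '9 hours'))), 0)::bigint
         from split_into_days(t.start_timestamp, t.end_timestamp) as s
         where extract(isodow from s.day_start) < 6
           and least(s.piece_end,      s.day_start + interval '18 hours')
             > greatest(s.piece_start, s.day_start + interval '9 hours')
       ) as working_hours_duration
from my_table as t
order by t.id;
```

The function is marked `stable` rather than `immutable` because `date_trunc()` on a `timestamptz` depends on the session time zone.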