I have the following sql query that is returning time spent logged in for users
select sum(duration) as time_spent, date_trunc('day', date)::date as date
from user_activities
where name = 'session'
and user_id = '#{user_id}'
group by date
order by 2
It’s working mostly fine, but I found some edge cases where users were logged in on a given day for more than 24 hours. When I dug into this, I discovered that one user had the following start/end time in their sessions on a given day which summed to more than 24hrs:
[2023-04-24 00:00:54.457183 UTC, 2023-04-24 00:02:01.388673 UTC],
[2023-04-24 00:13:22.536908 UTC, 2023-04-24 18:38:02.108004 UTC],
[2023-04-24 23:35:56.81623 UTC, 2023-04-25 20:24:20.60667 UTC]
so they were logged in at midnight for two minutes, logged in again from midnight to 6pm the next day and then again at 11pm until 8pm the next day. the query is summing up all of those times (which in fact does make sense, that’s what i’m asking it to do), but I’d like instead to sum up time from midnight to midnight so that this would make a duration of (approximately) 2 minutes (first session) + 1105 minutes (second session) + 25 minutes (last session, up to midnight of the given day).
The relevant columns of the user_activities
table looks like this:
:name=>"session",
:user_id=>"xxx",
:start_time=>2023-04-24 23:35:56.81623 UTC,
:end_time=>2023-04-25 20:24:20.60667 UTC,
:duration=>1,
:date=>#<Date: 2023-04-30 ((2460065j,0s,0n),+0s,2299161j)>,
The research I’ve done makes me thing I can use between
(such as here) but i haven’t found a way to make that work while using date_trunc
. I’m sure there is a way, but I’m a novice sql user at best and this is really giving me trouble. Thanks for any help!
2
Answers
The query below shows sample data of your user adding an other user with a session over two days (to demonstrate the principle)
The subquery
day_cnt
calculates the minimal start date of the sessions and thecount_days
that is covered with the sessions.This is used in subquery
cal
to generate a list of all dates in your data.This list of the dates is used to make the midnight split in the join in
dt2
, where also the corrected start and end timestamps are calculated (suffixed with2
)Finally the
extract ( epoch
is used to get the duration in minutes of the splitted sessions.Query with Sample Data
Output
The actual aggregation is left out as an exercise.
The following query returns the sum of the daily session times in seconds:
This query depends on all timestamps being in the same time zone. If mixed time zones are used, then the query should be modified to translate each timestamp to a common time zone.
A few notes regarding issues in the original question:
GROUP
andORDER BY
clauses to improve readability and maintainability.