I have spent a lot of time trying to solve this problem. I have a table `users_events`:
at_timestamp | users_count | event_type |
---|---|---|
100000 | 2 | 1 |
100001 | 4 | 1 |
100003 | 5 | 0 |
100005 | 1 | 1 |
100006 | 3 | 1 |
100008 | 2 | 0 |
100008 | 1 | 1 |
etc.
Event type 1 means IN and 0 means OUT. I need to find the timestamp at which the total number of users was the highest. In this example it is 100001, because 2 + 4 = 6. Then 5 go out and 1 is left. Then 1 user comes in and 3 more come in, so the total user count at timestamp 100006 is 5. Then, at the same timestamp, 2 leave and 1 comes in, and so on. I have tried to use window functions, but without IF/ELSE or something similar it doesn't work for me. I got something close, but not exactly what I need.
Example of my PostgreSQL query:
SELECT at_timestamp, sum(users_count) OVER (PARTITION BY event_type ORDER BY at_timestamp)
FROM users_events;
Then I tried:
SELECT at_timestamp,
       sum(users_count) OVER (PARTITION BY event_type ORDER BY at_timestamp)
       - LAG(users_count) OVER (PARTITION BY event_type)
FROM users_events;
But it doesn't work, because I need to subtract the previous value only if it is an OUT event. So I need an SQL query that calculates this correctly 😅
I will be very grateful for your help!
3 Answers
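A minimal sketch of a query of this kind, assuming the `users_events` table from the question. The CTE names `running` and `ranked` and the aliases `running_total` and `rnk` are illustrative, not taken from the original post:

```sql
-- Assumption: users_events(at_timestamp, users_count, event_type) as in the question.
WITH running AS (
    SELECT at_timestamp,
           -- IN events (1) add to the total, OUT events (0) subtract from it.
           sum(CASE WHEN event_type = 1 THEN users_count ELSE -users_count END)
               OVER (ORDER BY at_timestamp) AS running_total
    FROM users_events
),
ranked AS (
    SELECT at_timestamp,
           running_total,
           -- Rank 1 goes to every row that shares the highest running total.
           dense_rank() OVER (ORDER BY running_total DESC) AS rnk
    FROM running
)
-- DISTINCT collapses rows that share a timestamp (e.g. the two rows at 100008).
SELECT DISTINCT at_timestamp, running_total
FROM ranked
WHERE rnk = 1;
```

On the sample rows from the question this returns the single row 100001 with a running total of 6. Because the window orders by `at_timestamp` with the default frame, rows that share a timestamp are summed together, so both rows at 100008 see the same total.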
Because of the potential for ties in the ranking, this uses `dense_rank()` to filter for maximal output rows.

Short and concise: a `CASE WHEN` multiplier, as suggested by @shawnt00, in a running sum expression using a window clause. I use a named window because I find it more readable and PostgreSQL allows it.
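The running sum described above might look roughly like this. It assumes the `users_events` table from the question; the window name `w` and the alias `running_total` are illustrative:

```sql
-- Assumption: users_events(at_timestamp, users_count, event_type) as in the question.
SELECT at_timestamp,
       -- The CASE expression acts as a +1 / -1 multiplier on users_count.
       sum(users_count * CASE WHEN event_type = 1 THEN 1 ELSE -1 END) OVER w AS running_total
FROM users_events
-- Named window: defined once here, referenced by name in the SELECT list.
WINDOW w AS (ORDER BY at_timestamp);
```

For the sample data the running totals come out as 2, 6, 1, 2, 5, and 4 for the two rows at 100008.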
And finally, order by the running sum descending and apply a `LIMIT 1`:
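A sketch of the complete query under the same assumptions (the `users_events` table from the question, with `w` and `running_total` as illustrative names):

```sql
-- Assumption: users_events(at_timestamp, users_count, event_type) as in the question.
SELECT at_timestamp,
       sum(users_count * CASE WHEN event_type = 1 THEN 1 ELSE -1 END) OVER w AS running_total
FROM users_events
WINDOW w AS (ORDER BY at_timestamp)
-- Highest running total first, then keep only that row.
ORDER BY running_total DESC
LIMIT 1;
```

Note that `LIMIT 1` returns a single row even when several timestamps tie for the maximum; which of the tied rows comes back is then not determined by this query.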
You can do this arithmetically with your initial approach if you get rid of the `partition by` and multiply `users_count` by `2 * event_type - 1`.

That works because `event_type` seems to be an integer: `2 * 0 - 1` is `-1` and `2 * 1 - 1` is `1`. Using this as a multiplier, an `event_type` of `0` causes subtraction of the corresponding `users_count` from the stepping sum, and an `event_type` of `1` results in addition of the `users_count`.
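Applied to the first query from the question, the idea could be sketched like this. It assumes `event_type` is stored as an integer that only takes the values 0 and 1; the alias `running_total` is illustrative:

```sql
-- Assumption: event_type is an integer column containing only 0 (OUT) or 1 (IN).
SELECT at_timestamp,
       -- 2 * event_type - 1 maps 0 to -1 and 1 to +1.
       sum(users_count * (2 * event_type - 1)) OVER (ORDER BY at_timestamp) AS running_total
FROM users_events;
```

If `event_type` were a boolean or text column, the multiplier would need a cast or a `CASE` expression instead.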