skip to Main Content

This is my data.

username building action timestamp
user-1 building-1 IN 2024-04-10 01:00:00.000
user-1 building-1 OUT 2024-04-10 02:00:00.000
user-1 building-1 IN 2024-04-10 02:30:00.000
user-1 building-1 OUT 2024-04-10 04:00:00.000
user-1 building-1 IN 2024-04-11 10:00:00.000
user-1 building-1 OUT 2024-04-11 11:00:00.000
user-2 building-2 IN 2024-04-10 08:00:00.000
user-2 building-2 OUT 2024-04-10 09:00:00.000
user-2 building-3 OUT 2024-04-11 02:30:00.000
user-2 building-4 IN 2024-04-11 04:00:00.000
user-2 building-1 IN 2024-04-12 10:00:00.000
user-2 building-1 OUT 2024-04-12 11:00:00.000

I need to calculate swipe in and swipe out time. Sometimes we received partial record and we need to include those records also.

My expected output is:

enter image description here

Followed using windowing functions like lead, lag. After using those functions, I’m not able to continue further to get this output.

attached table data:

select 'user-1' as username,'building-1' as building,'IN' as action,'2024-04-10 01:00:00'::timestamp as timestamp
union all 
select 'user-1' as username,'building-1' as building,'OUT' as action,'2024-04-10 02:00:00'::timestamp as timestamp
union all 
select 'user-1' as username,'building-1' as building,'IN' as action,'2024-04-10 02:30:00'::timestamp as timestamp
union all
select 'user-1' as username,'building-1' as building,'OUT' as action,'2024-04-10 04:00:00'::timestamp as timestamp
union all 
select 'user-1' as username,'building-1' as building,'IN' as action,'2024-04-11 10:00:00'::timestamp as timestamp
union all 
select 'user-1' as username,'building-1' as building,'OUT' as action,'2024-04-11 11:00:00'::timestamp as timestamp
union all 

select 'user-2' as username,'building-2' as building,'IN' as action,'2024-04-10 08:00:00'::timestamp as timestamp
union all 
select 'user-2' as username,'building-2' as building,'OUT' as action,'2024-04-10 09:00:00'::timestamp as timestamp
union all 
select 'user-2' as username,'building-3' as building,'OUT' as action,'2024-04-11 02:30:00'::timestamp as timestamp
union all
select 'user-2' as username,'building-4' as building,'IN' as action,'2024-04-11 04:00:00'::timestamp as timestamp
union all 
select 'user-2' as username,'building-1' as building,'IN' as action,'2024-04-12 10:00:00'::timestamp as timestamp
union all 
select 'user-2' as username,'building-1' as building,'OUT' as action,'2024-04-12 11:00:00'::timestamp as timestamp

)

select * from _data

2

Answers


  1. To calculate swipe in and swipe out times in PostgreSQL, you’ll typically use SQL queries to extract the relevant information from your database tables. This involves identifying the fields that store the swipe times and then manipulating the data using SQL functions to calculate the durations. If you provide the table structure and field names, I can help you draft a query tailored to your specific setup.

    Login or Signup to reply.
  2. You can use lead/lag along with some CASE/WHEN/END for this.

    Schema (PostgreSQL v15)

    CREATE TABLE swipes (
        "username"  VARCHAR(6),
        "building"  VARCHAR(10),
        "action"    VARCHAR(3),
        "timestamp" TIMESTAMP
    );
    
    INSERT INTO swipes
        ("username", "building", "action", "timestamp")
    VALUES ('user-1', 'building-1', 'IN', '2024-04-10 01:00:00.000')
         , ('user-1', 'building-1', 'OUT', '2024-04-10 02:00:00.000')
         , ('user-1', 'building-1', 'IN', '2024-04-10 02:30:00.000')
         , ('user-1', 'building-1', 'OUT', '2024-04-10 04:00:00.000')
         , ('user-1', 'building-1', 'IN', '2024-04-11 10:00:00.000')
         , ('user-1', 'building-1', 'OUT', '2024-04-11 11:00:00.000')
         , ('user-2', 'building-2', 'IN', '2024-04-10 08:00:00.000')
         , ('user-2', 'building-2', 'OUT', '2024-04-10 09:00:00.000')
         , ('user-2', 'building-3', 'OUT', '2024-04-11 02:30:00.000')
         , ('user-2', 'building-4', 'IN', '2024-04-11 04:00:00.000')
         , ('user-2', 'building-1', 'IN', '2024-04-12 10:00:00.000')
         , ('user-2', 'building-1', 'OUT', '2024-04-12 11:00:00.000')
         -- added those rows to showcase the results when the same action happens twice in a row
         , ('user-3', 'building-1', 'IN', '2024-04-12 11:00:00.000')
         , ('user-3', 'building-1', 'IN', '2024-04-12 12:00:00.000')
         , ('user-3', 'building-1', 'OUT', '2024-04-12 13:00:00.000')
    ;
    

    Query #1

    -- Since the information may be partial we can't simply look at all "in"s or "out"s,
    -- so we look at all the rows & end up with everything twice, hence the DISTINCT.
    SELECT DISTINCT
           username
         , building
         , CASE
               WHEN action = 'IN' THEN timestamp
               -- To find the entry time corresponding to this "out" row, we'll look at the previous row.
               -- If it's an "in", then we keep the timestamp.
               -- If it's an "out", then there were 2 "outs" in a row, so the "in" value is set to null (see user-3's rows for an example of this).
               WHEN action = 'OUT' THEN
                   CASE WHEN LAG(action) OVER user_actions_per_building_by_timestamp = 'IN'
                       THEN LAG(timestamp) OVER user_actions_per_building_by_timestamp
                   END
           END AS "in"
         , CASE
               -- same concept as above, with IN/OUT & LAG/LEAD flipped.
               WHEN action = 'IN' THEN
                   CASE WHEN LEAD(action) OVER user_actions_per_building_by_timestamp = 'OUT'
                       THEN LEAD(timestamp) OVER user_actions_per_building_by_timestamp
                   END
               WHEN action = 'OUT' THEN timestamp
        END AS "out"
    FROM swipes
    WINDOW user_actions_per_building_by_timestamp AS (PARTITION BY username, building ORDER BY timestamp)
    ORDER BY username, building, "in", "out";
    
    username building in out
    user-1 building-1 2024-04-10T01:00:00.000Z 2024-04-10T02:00:00.000Z
    user-1 building-1 2024-04-10T02:30:00.000Z 2024-04-10T04:00:00.000Z
    user-1 building-1 2024-04-11T10:00:00.000Z 2024-04-11T11:00:00.000Z
    user-2 building-1 2024-04-12T10:00:00.000Z 2024-04-12T11:00:00.000Z
    user-2 building-2 2024-04-10T08:00:00.000Z 2024-04-10T09:00:00.000Z
    user-2 building-3 2024-04-11T02:30:00.000Z
    user-2 building-4 2024-04-11T04:00:00.000Z
    user-3 building-1 2024-04-12T11:00:00.000Z
    user-3 building-1 2024-04-12T12:00:00.000Z 2024-04-12T13:00:00.000Z

    View on DB Fiddle


    Another way could be to use a recursive CTE, which might be easier to understand if the conditions get more complex (and it would also perform differently, though you would need to check this against some actual data).

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search