skip to Main Content

We have

A wide-format PostgreSQL table of time series data from various sensors:

datetime s_1 s_n Note
20230101T10:00:00Z 3.00 null
20230101T11:00:00Z 2.00 10.0
20230101T12:00:00Z 1.00 null
20230101T23:00:00Z 4.00 null next day in tz=Europe/Berlin!
20230102T10:00:00Z 2.00 null
20230102T10:00:00Z 1.00 9.3

datetime is a UTC-timestamp and s_i are floats. Marked as bold are the sought-after
values. In practice, n is a few 100.

We want

An SQL query (preferably one, avoiding intermediate tables) which yields a table in
long format containing for each sensor s_i its maximal (non-null) value for each day
in local time (Europe/Berlin, +01:00/+02:00 wrt UTC during normal/DST time):

datetime variable value
20230101T10:00:00Z s_1 3.00
20230101T23:00:00Z s_1 4.00
20230101T11:00:00Z s_n 10.00
20230102T10:00:00Z s_n 9.3

Less-optimal alternative would be the same table in wide format:

datetime s_1 s_n Note
20230101T10:00:00Z 3.00 null
20230101T11:00:00Z null 10.0 non-maximal value for s_1 set to null!
20230101T23:00:00Z 4.00 null
20230102T10:00:00Z null 9.3 non-maximal value for s_1 set to null!

What we tried

Actually a lot. We got closes to a solution with the query

SELECT 
    date (timezone('Europe/Berlin', datetime))::timestamptz as datetime, 
    MAX (s_1) as s_1_max, 
    MAX (s_n) as s_n_max
FROM table
GROUP BY date (timezone('Europe/Berlin', datetime))

which yields a wide-format table of this form:

datetime s_1_max s_n_max Note
20230101T00:00:00Z 3.00 10.0 one row per day, time lost
20230102T00:00:00Z 4.00 9.3 one row per day, time lost

Yet, the time information is lost (all times set to 00:00:00) here, and all sought-after
maximal values are aggregated in one row.

2

Answers


  1. dbfiddle demo

    select datetime, col, val 
    from (
        select datetime, col, val, 
               rank() over (partition by dt, col order by val desc) rnk 
        from ( 
            select datetime,
                   date(timezone('Europe/Berlin', datetime)) dt,
                   unnest(array['s_1', 's_2']) as col,
                   unnest(array[s_1, s_2]) as val
            from tbl t) unpivoted
        where val is not null ) ranked
    where rnk = 1
    
    Login or Signup to reply.
  2. When doing:

    select date(timezone('Europe/Berlin','20230101T10:00:00Z'))::timestamptz;
    

    You will convert the datetime value to a date, which looses time (2023-01-01 00:00:00+01).

    do this:

    select (timezone('Europe/Berlin','20230101T10:00:00Z'))::timestamptz;
    

    results in: 2023-01-01 11:00:00+01, which is the converted (date)time in Berlin-timezone.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search