skip to Main Content

I’ve got a SQL query that selects every data between two dates and now I would like to add the time scale factor so that instead of returning all the data it returns one data every second, minute or hour.

Do you know how I can achieve it ?

My query :

"SELECT received_on, $1 FROM $2 WHERE $3 <= received_on AND received_on <= $4", [data_selected, table_name, date_1, date_2]

The table input:

Table Input

As you can see there are several data the same second, I would like to select only one per second

2

Answers


  1. If you want to select data every second, you may use ROW_NUMBER() function partitioned by ‘received_on’ as the following:

    WITH DateGroups AS
    (
      SELECT *, ROW_NUMBER() OVER (PARTITION BY received_on ORDER BY adc_v) AS rn
      FROM table_name
    )
    SELECT received_on, adc_v, adc_i, acc_axe_x, acc_axe_y, acc_axe_z 
    FROM DateGroups
    WHERE rn=1 
    ORDER BY received_on 
    

    If you want to select data every minute or hour, you may use the extract function to get the number of seconds in ‘received_on’ and divide it by 60 to get the minutes or divide it by 3600 to get the hours.

    epoch: For date and timestamp values, the number of seconds since 1970-01-01 00:00:00-00 (can be negative); for interval values, the total number of seconds in the interval

    Group by minutes:

    WITH DateGroups AS
    (
      SELECT *, ROW_NUMBER() OVER (PARTITION BY floor(extract(epoch from (received_on)) / 60) ORDER BY adc_v) AS rn
      FROM table_name
    )
    SELECT received_on, adc_v, adc_i, acc_axe_x, acc_axe_y, acc_axe_z 
    FROM DateGroups
    WHERE rn=1  
    ORDER BY received_on
    

    Group by hours:

    WITH DateGroups AS
    (
      SELECT *, ROW_NUMBER() OVER (PARTITION BY floor(extract(epoch from (received_on)) / (60*60)) ORDER BY adc_v) AS rn
      FROM table_name
    )
    SELECT received_on, adc_v, adc_i, acc_axe_x, acc_axe_y, acc_axe_z 
    FROM DateGroups
    WHERE rn=1  
    ORDER BY received_on
    

    See a demo.

    Login or Signup to reply.
  2. When there are several rows per second, and you only want one result row per second, you can decide to pick one of the rows for each second. This can be a randomly chosen row or you pick the row with the greatest or least value in a column as shown in Ahmed’s answer.

    It would be more typical, though, to aggregate your data per second. The columns show figures and you are interested in those figures. Your sample data shows two times the value 2509 and three times the value 2510 for the adc_v column at 2022-07-29, 15:52. Consider what you would like to see. Maybe you don’t want this value go below some boundary, so you show the minimum value MIN(adc_v) to see how low it went in the second. Or you want to see the value that occured most often in the second MODE(adc_v). Or you’d like to see the average value AVG(adc_v). Make this decision for every value, so as to get the informarion most vital to you.

    select
      received_on,
      min(adc_v),
      avg(adc_i),
      ...
    from mytable
    group by received_on
    order by received_on;
    

    If you want this for another interval, say an hour instead of the month, truncate your received_on column accordingly. E.g.:

    select
      date_trunc('hour', received_on) as received_hour,
      min(adc_v),
      avg(adc_i),
      ...
    from mytable
    group by date_trunc('hour', received_on)
    order by date_trunc('hour', received_on);
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search