skip to Main Content

There is a table storing name, speed and timestamp of cars. Each car sends its speed at irregular time intervals, some send more often, some may have not sent their data for a couple of hours, simplified example (real table has 20 cars and roughly 3000 lines each):

name speed timestamp
yellow 100 1707532681
yellow 110 1707532661
yellow 120 1707532621
yellow 110 1707532631
yellow 140 1707532681
red 100 1707432681
red 120 1707332681
red 150 1707232681
red 170 1707532681

I would like to get the average speed for each car for their individual (!) last 24 hours. I am struggling to get that into a single SQL-statement. Have tried with multiple SELECT and IN, MAX and AVG functions without luck. Any help is appreciated.

I have tried to get the MAX of each car:
SELECT name, MAX(timestamp) AS s FROM data GROUP BY name

And put this into another SELECT:
SELECT AVG(speed) FROM data WHERE (name, timestamp) IN (SELECT name, MAX(timestamp) AS s FROM data GROUP BY name) AND timestamp > s – 86400

But the latter is breaking because s is unknown – of course it is. How do I get it into the other SELECT then? I need the MAX value of each timestamp because each car has its own last speed measurement from which it should go 24 hours back and calculate the average. Hope this makes it more clear.

2

Answers


  1. Assuming the table name is Table. Also assuming this is a MySQL database, because of the tag in your question.

    First let’s create a CTE where we create a new column where we convert from UNIX timestamp to a standard one, using the MySQL function FROM_UNIXTIME()

    WITH processed AS (
    SELECT name, speed, FROM_UNIXTIME(`timestamp`) AS report_timestamp
    FROM Table
    )
    

    Note: Since timestamp is a keyword, the column name will need to escaped using backticks.

    Now, since we have the data – including timestamp – in the format we need, let’s solve the problem. We can do this using Window Functions.

    Although the caveat here is that the average speed will be calculated for each row of the data.

    For this, as per your question, we will calculate the average speed, for each car, for the last 24 hours from the timestamp available on the current row. For this we will use the RANGE clause to specify the frame definition, which in this case is the previous 1 day.

    Let’s use the above CTE as the source table.

    SELECT 
     *,
     AVG(speed) OVER(
        PARTITION BY name
        ORDER BY report_timestamp ASC
        RANGE BETWEEN INTERVAL 1 DAY PRECEDING AND CURRENT ROW
       ) AS avg_speed_24h
    FROM processed;
    
    Login or Signup to reply.
  2. If you just want the name and the average speed for the 24 hours up to the most recent report, you can use GROUP BY and MAX to get the most recent timestamp and a correlated subquery for the average speed:

    SELECT
      name,
      (SELECT AVG(speed) FROM data WHERE name = d.name AND timestamp >= (MAX(d.timestamp) - 86400)) AS avg_speed_24h
    FROM data d
    GROUP BY name;
    

    Alternatively, if you want the full set of rows as provided by wandering-geek, there is no need to convert from unix timestamps:

    SELECT
      name,
      AVG(speed) OVER (
        PARTITION BY name
        ORDER BY `timestamp`
        RANGE BETWEEN 86400 PRECEDING AND CURRENT ROW
      ) AS avg_speed_24h
    FROM data;
    

    Here’s a db<>fiddle.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search