I’m trying to calculate the time-weighted average using LOCF over a 24-hour interval with the following data:
| Time | Value |
|---|---|
| 2020-01-01 00:00:00 | 1 |
| 2020-01-01 23:00:00 | 1000 |
The query looks as follows:
```sql
SELECT time_bucket('1 day', time) AS timebucket,
       average(time_weight(
           'locf',
           time,
           value
       ))
FROM series_points_float
WHERE time BETWEEN TIMESTAMP '2020-01-01 00:00:00+00:00' AND TIMESTAMP '2020-01-02 00:00:00+00:00'
GROUP BY timebucket
```
The result is just 1. Can someone please point out why the second data point is not taken into consideration?
For a 24-hour interval, my manual calculation would be 1 * 23/24 + 1000 * 1/24, which equals 42.625.
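To sanity-check that arithmetic, here is a small Python sketch of an LOCF time-weighted average over the bucket (the helper function is my own illustration, not part of the Toolkit):

```python
def locf_time_weighted_average(points, bucket_start, bucket_end):
    """Time-weighted average with last-observation-carried-forward:
    each value is held until the next observation, or the bucket end."""
    total = 0.0
    for i, (t, v) in enumerate(points):
        t_next = points[i + 1][0] if i + 1 < len(points) else bucket_end
        total += v * (t_next - t)
    return total / (bucket_end - bucket_start)

# Hours since the bucket start, for the two sample points.
points = [(0, 1), (23, 1000)]
print(locf_time_weighted_average(points, 0, 24))  # 1*23/24 + 1000*1/24 = 42.625
```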
2 Answers
I think you are right; I’ve opened: https://github.com/timescale/timescaledb-toolkit/issues/732
When calculating a standard average from a time_weight, we don’t know the duration of the whole period we’re averaging over, so we can’t extend a value to the end of a bucket. For that you have to use interpolated_average, though it’s still experimental and the docs are slightly off right now. Unfortunately, it’s very difficult to pass information from the separate time_bucket function into the average function so that it knows it should extend all the way to the edge of the bucket, so we have a separate function that takes that information explicitly. Its arguments for the neighbouring buckets are nullable: you can pass NULL at the edges, or pass the summaries from before and after to interpolate more accurately when you don’t have a data point at the very beginning of the interval.
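Since the answer’s original example didn’t survive in the post, here is a rough sketch of what the call might look like, based on the experimental Toolkit API of the time (the `toolkit_experimental` schema and exact argument order are assumptions; check your Toolkit version’s docs):

```sql
-- Sketch only: reuses the series_points_float table from the question.
-- prev/next summaries come from window functions over neighbouring buckets;
-- NULL at the edges is fine.
WITH buckets AS (
    SELECT time_bucket('1 day', time) AS timebucket,
           time_weight('locf', time, value) AS tws
    FROM series_points_float
    WHERE time >= '2020-01-01 00:00:00+00' AND time < '2020-01-02 00:00:00+00'
    GROUP BY timebucket
)
SELECT timebucket,
       toolkit_experimental.interpolated_average(
           tws,
           timebucket,            -- start of this bucket
           '1 day'::interval,     -- bucket width, so the weight can run to the edge
           LAG(tws)  OVER (ORDER BY timebucket),  -- summary before (NULL if none)
           LEAD(tws) OVER (ORDER BY timebucket)   -- summary after  (NULL if none)
       )
FROM buckets;
```

With only the two sample points and LOCF extended to the bucket edge, this is the query shape that should produce the 42.625 the question expects, rather than 1.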