Help with terminology would be appreciated to help me rephrase the question I am trying to ask in my post.
The Problem:
I am using TimeScale DB and am trying to create a time aggregate with a 4 hour time bucket interval. I want the 4 hour intervals to calculate outside of the scope of the same day. So the first timestamp for the day would be an from 10PM the previous day to 2AM the new day. The default behavior is to create the intervals starting at midnight of the same day. Is there a way I can change this?
Output to further explain
This is the source table of 5 minute data.
GBPUSD | 2024-02-04 22:15:00+00 | 1.26291 | 1.26291 | 1.2626 | 1.26275
GBPUSD | 2024-02-04 22:10:00+00 | 1.26345 | 1.26355 | 1.26278 | 1.26297
GBPUSD | 2024-02-04 22:05:00+00 | 1.26355 | 1.26376 | 1.26344 | 1.26344
GBPUSD | 2024-02-02 21:55:00+00 | 1.26339 | 1.26348 | 1.26302 | 1.26302
Note, that the first timestamp for Feb 4 is 10:05PM
This is the 4 hour interval time aggregate:
GBPUSD | 2024-02-05 04:00:00+00 | 1.26096 | 1.26238 | 1.26034 | 1.26167
GBPUSD | 2024-02-04 20:00:00+00 | 1.26355 | 1.2639 | 1.26113 | 1.26168
GBPUSD | 2024-02-02 20:00:00+00 | 1.26387 | 1.2643 | 1.26302 | 1.26302
Instead of the 8PM and 4AM I want them to fall on 2AM then 6AM and so on for Feb. 5.
The Code:
This is how I am creating the aggregate.
CREATE MATERIALIZED VIEW forex4hour
WITH (timescaledb.continuous) AS
SELECT symbol,
time_bucket(INTERVAL '4 hour', timestamp) AS _4h,
first(open, timestamp) as open,
max(high) as high,
min(low) as low,
last(close, timestamp) as close
FROM forex5m
GROUP BY symbol, _4h;
SELECT add_continuous_aggregate_policy('forex4hour',
start_offset => INTERVAL '2 year',
end_offset => INTERVAL '1hour' ,
schedule_interval => INTERVAL '1 hour');
2
Answers
I had to look at this for a long time but I got it to work.... There does not seem to be any documentation to reflect this BUT to get the origin to work in the create statement:
time_bucket(INTERVAL '4 hour', timestamp, 'UTC', '2020-01-03 02:00:00')
The time_bucket function has an optional parameter (origin), which defines the origin of the calculated time buckets and in your case defaults to midnight on January 3, 2000. So, to shift the start point of the time buckets, you have to set origin accordingly, in your case to 2AM (or 10PM, there shouldn’t be a difference).
For hour based intervals, the date part of origin doesn’t matter. It could be any valid date.
Also, since you are creating a materialized view, you may have to consider timezone settings, to get the desired results (depending on your server and clients timezone settings). There is another optional parameter (timezone) to control that (see).
Update
As Welchums pointed out and until timescaledb provides a fix for that, it is not posiible to use the time_bucket() function with an origin parameter, for the definition of a continuous aggregate.