Given a table called budgets
containing a rate
and a time_span
I want to generate a monthly daily rate for each day. The time_span
must have a start date but can be open ended ["2023-06-02 00:00:00+00",)
.
For example given the following dataset:
rate
: $10time_span
:["2023-07-19 00:00:00+00",)
Assume the current day is 2023-09-05 00:00:00+00
I would expect the query result to show:
2023-07-19 -> 2023-08-18
arate
of$10
should be divided 30 way for a daily rate of$0.333
2023-08-19 -> 2023-09-05
arate
of$10
should be divided 17 days for a daily rate of$0.588
Attempt 1:
I have first attempted to solve this by generating a series from the time_span
then dividing the rate against the date_part
interval of one month. This is flawed as the time_bucket
in the date_part
does not account for the correct number of days within its monthly window.
SELECT
CAST(time_bucket AS TIMESTAMP) AT TIME ZONE 'America/New_York' AS time_bucket,
rate / DATE_PART('days', time_bucket + INTERVAL '1 month - 1 day') AS daily_rate
FROM (
SELECT
DATE(generate_series(LOWER(time_span)::TIMESTAMP, COALESCE(UPPER(time_span),NOW())::TIMESTAMP, '1 day')) AS time_bucket,
rate
FROM budgets
) AS daily_rates;
How do I generate the daily rate on a monthly basis for this budgets table?
2023-07-19
–0.333
2023-07-20
–0.333
2023-07-21
–0.333
- ….
2023-08-19
–0.588
2023-08-20
–0.588
- ….
3
Answers
I didn’t adhere exactly to your structure for simplicity and it wasn’t clear exactly how you wanted the output presented. But I threw together a SQL Fiddle to show the basic structure of a possible solution. http://sqlfiddle.com/#!17/d52a4/14
The essential bit
Using a recursive
cte
:See fiddle
fiddle
Note the rate for the first month of your example is 0.323 not
0.333, since August has 31 days, not 30.Generates a series of months in the first lateral subquery
m
Calculate the number of day in this rate period of a month or shorter:
mon_days
Generate as many rows as there are days in
mon_days
in the second lateral function calli
– effectively just shorthand forCROSS JOIN LATERAL
. See:Compute the display date
the_day
and itsdaily_rate
in the outerSELECT
.There is a lot of fine print to this kind of calculation:
To avoid undefined corner cases and ambiguities with time zones vs. local time and daylight saving time, I build on the data type
daterange
fortime_span
instead oftstzrange
like your sample data suggests.Like your sample data suggests, I base monthly slices off the actual start date in
timespan
, not the first of the month.I assume you want to end the time series with the current date, so I use
LEAST
instead ofCOALESCE
. (And anotherLEAST
for a different reason.)I divide the rate by the actual number of days in each monthly slice.
rate
should be type numeric to avoid rounding errors and work withround()
out of the box.I assume all columns
NOT NULL
.Note that
date - date → int
anddate + int → date
.I base the first
generate_series()
call ontimestamp
, notdate
, because:Your actual query will use the actual current date (or timestamp) instead of the given, fixed 2023-09-05 for "today":