I want to write a Postgres function that looks up the value of a column in a given table.
Example setup:
CREATE SCHEMA IF NOT EXISTS example;
CREATE TABLE example.price_table (
symbol text NOT NULL,
time_of timestamptz NOT NULL,
price float8 null
volume float8 null );
CREATE TABLE example.interesting_times (
symbol text NOT NULL,
time_of timestamptz NOT NULL );
Sample data
INSERT INTO example.price_table VALUES
('GME', '2016-01-01 00:00:00', 10.0, 10),
('GME', '2016-01-02 00:00:00', 11.0, 3),
('GME', '2016-01-03 00:00:00', 9.0 , 7),
('GME', '2016-01-03 00:40:00', 10.5, 76),
('GME', '2016-01-04 00:00:00', 11.0, 10),
('BBBY', '2016-01-01 00:00:00', 20.0, 5),
('BBBY', '2016-01-02 02:30:00', 2123.0, 1),
('BBBY', '2016-01-02 03:30:00', 2133.0, 1),
('BBBY', '2016-01-02 04:30:00', 2144.0, 100),
('BBBY', '2016-01-03 00:00:00', 29.0, 999),
('BBBY', '2016-01-03 03:40:00', 20.5, 1),
('BBBY', '2016-01-03 04:40:00', 20.6, 54),
('BBBY', '2016-01-04 00:00:00', 21.0, 34),
('BBBY', '2016-01-06 00:00:00', 666.0, 1);
INSERT INTO example.interesting_times VALUES
('GME', '2016-01-01 00:00:00'),
('GME', '2016-01-02 00:30:00'),
('GME', '2016-01-03 05:00:00'),
('GME', '2016-01-03 00:40:00'),
('GME', '2016-01-04 00:00:00'),
('BBBY', '2016-01-01 01:00:00'),
('BBBY', '2016-01-02 00:00:00'),
('BBBY', '2016-01-03 00:00:00'),
('BBBY', '2016-01-03 07:40:00'),
('BBBY', '2016-01-04 00:00:00');
While querying the interesting_times
table I want to add the earliest market price from the markout_price_table
table, at least 3 hours, but not more than 3 hours + 24 hours into the future. Ideally, I would just call a markout_price_table
function like:
select symbol
, time_of
, markout_price_table( symbol
, time_of
, 'price'
, '3 hours'
, '24 hours') as price_3_hours_later
from example.interesting_times
order by symbol, time_of;
I could get different markout times by adding:
markout_price_table(symbol, time_of, 'price', '6 hours', '24 hours') as price_6_hours_later`
or
markout_price_table(symbol, time_of, 'volume', '0 hours', '24 hours') as volume_at_time
Is it possible to encapsulate the logic in a function like this? If so, would it be as fast as getting all of the data downloaded and doing the merges locally?
I can get the desired result of the above using a few merges but I don’t like this as a solution as I would like to encapsulate good logic for the merges rather than copy pasting the below lots of times into each query.
with tabA as (
select a.symbol as symbol
, a.time_of as time_of
, b.price as price_3_hours_later
, b.time_of - (a.time_of + '3 hours') as timeliness
from example.interesting_times a
left join example.price_table b
on (a.symbol = b.symbol)
and (a.time_of + '3 hours' < b.time_of)
and (a.time_of + '3 hours' + '24 hours' > b.time_of)
order by b.time_of - (a.time_of + '3 hours')
)
, tabB as (
select symbol
, time_of
, price_3_hours_later
, timeliness
, min(timeliness) over w1 as min_tim
from tabA
window w1 as (partition by symbol, time_of order by timeliness)
)
select symbol
, time_of
, price_3_hours_later
from tabB
where (timeliness = min_tim)
or (timeliness is null)
order by symbol, time_of;
So the result should look like:
Symbol time_of price_3_hours_later
BBBY 2016-01-01 01:00:00.000 +0000 2123.0
BBBY 2016-01-02 00:00:00.000 +0000 2133.0
BBBY 2016-01-03 00:00:00.000 +0000 20.5
BBBY 2016-01-03 07:40:00.000 +0000 21.0
BBBY 2016-01-04 00:00:00.000 +0000 NULL
GME 2016-01-01 00:00:00.000 +0000 11.0
GME 2016-01-02 00:30:00.000 +0000 9.0
GME 2016-01-03 00:40:00.000 +0000 11.0
GME 2016-01-03 05:00:00.000 +0000 11.0
GME 2016-01-04 00:00:00.000 +0000 NULL
2
Answers
You can shorten that query and wrap it in a function: demo at db<>fiddle
And that would work exactly how you wanted:
Since you put
price
right in the name of the function, I don’t think it makes sense to pass'price'
as a parameter.If you wanted the function to dynamically select the
price
column, it would complicate things a bit. You would have to make it a polymorphic function and pass a sample value into into it, so that it can infer its return type at call time.Can be solved with a single, plain SQL statement using a
LATERAL
subquery:This needs an index on
price_table (symbol, time_of)
to be fast. Ideally:You can wrap this into a SQL function if you insist:
Call:
Zegarek already provided a valid drop-in version of the function. But be sure to add appropriate function labels to not hinder performance:
Call:
fiddle
Actually, either solution (including Zegarek’s) will be substantially faster, as long as you have that index.
Related: