I have a trade history table that looks like this:
CREATE TABLE trade (
trade_id INT PRIMARY KEY,
stock_id INT NOT NULL,
date DATE NOT NULL,
sell BOOLEAN NOT NULL,
shares INT NOT NULL,
usd_total DECIMAL(10, 2) NOT NULL
);
I want to query my trade record for a tax report. Something that result in
stock_id | buy_date | sell_date | shares | buy_usd_total | sell_usd_total |
---|---|---|---|---|---|
meaning each sell trade would be matched with one or more rows, each time "spending" shares from the buy rows then moving on to the next.
For example trades of
trade_id | stock_id | date | sell | shares | usd_total |
---|---|---|---|---|---|
1 | 1 | 2021-01-01 | 0 | 2 | 100.00 |
2 | 1 | 2021-01-02 | 0 | 2 | 120.00 |
3 | 1 | 2021-01-03 | 1 | 3 | 300.00 |
4 | 1 | 2021-01-04 | 1 | 1 | 150.00 |
should result in
stock_id | buy_date | sell_date | shares | buy_usd_total | sell_usd_total |
---|---|---|---|---|---|
1 | 2021-01-01 | 2021-01-03 | 2 | 100.00 | 200.00 |
1 | 2021-01-02 | 2021-01-03 | 1 | 60.00 | 100.00 |
1 | 2021-01-02 | 2021-01-04 | 1 | 60.00 | 150.00 |
Is this something that’s possible with SQL? I’ve tried to wrap my head around it, but keeping score of the remaining shares of buys and sells is difficult. Should I push forward or go with an iterative approach in python?
2
Answers
You can enumerate individual share purchase/sale events and self-join to match the 1st purchase to the 1st sale, 2nd purchase to the 2nd sale and so on: demo at db<>fiddle
count(*)over w1
uses the window(partition by stock_id,sell order by date,n)
. It’ll keep a separate running/stepping/tumbling count for eachstock_id
, also separating sales from purchases, following the buy/sell eventdate
.cross join lateral
generate_series(1,shares)
clones eachtrade
row as many times as there wereshares
in the event, to produce a separate sale/purchase row for each individual share. Theusd_total/shares
establishes their per-share price.My answer is going to be much more verbose than the answer by Zegarek but I believe it works correctly with data beyond the sample dataset. I use recursion in the initial CTE to break out the individual shares for each transaction then aggregate them in terms of buys and sells so I can match them back up per-share to get not only the requested output, but also things like days held and gain/loss.
Fiddle: https://dbfiddle.uk/I7-8vTr3
The result set is the same as Zegarek’s: