I’m using PostgreSQL and I have a table
with the following columns: id
, distance
, and length
. I want to order the table by distance
and create a new column called cum_length
using a window function. I also want to filter the rows so that only rows until cum_length
value cross a certain threshold are included in the final result.
Example of input table
:
id | distance | length |
---|---|---|
1 | 10 | 1 |
2 | 5 | 2 |
3 | 8 | 1 |
4 | 1 | 3 |
5 | 3 | 2 |
6 | 9 | 2 |
Desired output for a threshold of 6
:
id | distance | length | cum_length |
---|---|---|---|
4 | 1 | 3 | 3 |
5 | 3 | 2 | 5 |
2 | 5 | 2 | 7 |
This is the SQL that I came up with:
WITH ordered_table AS (
SELECT id,
distance,
length,
SUM(length) OVER (ORDER BY distance) AS cum_length
FROM table)
SELECT *
FROM ordered_table
WHERE cum_length <= 6
But this omits the last row of the desired result.
3
Answers
Here is it based on your
ordered_table
query and an extralag
window function to calculateprevious_is_less
than the threshold value. Records whereprevious_is_less
is true or null qualify for selection.DB-Fiddle demo
Try the following:
For your sample data, this is equivalent to
WHERE cum_length <= 7
.See demo
May be interesting to compare with this one on large data set, only one sorting by distance and should stop scanning data as soon as the threshold is found