INSERT INTO "public"."catalog_element"
("id", "catalogue_id", "element_id", "weight") VALUES
(1,100,1,0),
(2,100,2,1),
(3,100,3,2),
(4,10,1,0),
(5,10,5,0),
(6,10,6,1),
(7,100,7,1);
Let's say we have a Postgres table of catalog-to-element matching rows that are weighted, like so:
id | catalogue_id | element_id | weight |
---|---|---|---|
1 | 100 | 1 | 0 |
2 | 100 | 2 | 1 |
3 | 100 | 3 | 2 |
4 | 10 | 1 | 0 |
5 | 10 | 5 | 0 |
6 | 10 | 6 | 1 |
7 | 100 | 7 | 1 |
I'm trying to write a query that returns the records bound to a certain catalog_id, ordered by weight, in a range from the first row down to the one with a certain element_id.
For example: where catalog_id = 100, ordered DESC by weight, taking every row up until element_id = 7 is met, which ends up with this result:
id | catalogue_id | element_id | weight |
---|---|---|---|
3 | 100 | 3 | 2 |
2 | 100 | 2 | 1 |
7 | 100 | 7 | 1 |
Answers
I figured out a way to do what I want, but it's too clunky.
To select ordered rows up to a certain ID on the data in the given insert statement, you can use the following SQL query:
Replace <max_id> with the maximum ID value that you want to include in the results. This query will select all rows from the catalog_element table where the ID value is less than or equal to the specified maximum ID, and order the results first by catalogue_id in ascending order, then by weight in ascending order, and finally by element_id in ascending order.
For example, if you want to select all rows with an ID value less than or equal to 5, the query would be:
This would return the following rows:
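The query itself was not preserved above; here is a sketch matching the answer's description, with `<max_id>` replaced by 5:

```sql
SELECT id, catalogue_id, element_id, weight
FROM catalog_element
WHERE id <= 5                                -- <max_id> = 5
ORDER BY catalogue_id, weight, element_id;
-- on the sample data the rows come back in the order: id 4, 5, 1, 2, 3
```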
You can use a subquery to get the wanted weight, and then select the rows whose weight is greater than or equal to that weight.
fiddle
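A sketch of that approach, assuming the cut-off element is element_id = 7 in catalogue 100 (the actual query is in the fiddle):

```sql
SELECT ce.*
FROM catalog_element ce
WHERE ce.catalogue_id = 100
  AND ce.weight >= (SELECT weight            -- the wanted weight
                    FROM catalog_element
                    WHERE catalogue_id = 100
                      AND element_id = 7)
ORDER BY ce.weight DESC;
```

Note that this may also keep rows that share the cut-off row's weight but would sort after it; on the sample data it happens to return exactly ids 3, 2 and 7.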
I'm not sure if you consider my solution less clunky.
It does avoid the self-join, which should result in better performance for large tables.
With the setup from the insert statement above, it yields the desired result.
While statements like this seem intimidating, they are actually not that hard to understand.
You used analytic (window) functions yourself, so I assume basic knowledge of how they work.
Since you can't directly nest such functions, I use nested selects, where each nesting level adds one function. Therefore such statements should be read from the inside out.
The innermost one, `wf` for with_flag, adds a `flag` column marking the cut-off row. This happens in its own subselect, since we need the flag twice in the next step.
In `wmf` for with_max_flag we use `max(flag) over (order by weight desc, flag)` to create the `max_flag` column, which has a `0` for all rows before the cut-off row. Note that we order by `flag` as well, so that rows with the same weight as the cut-off row also get a `0`.
We can now almost filter by that column, but we want to include the cut-off row itself, so we use `lag` to basically shift `max_flag` down by one row, and `coalesce` to turn `null` values into `0`. This happens in `wlmf` for with_lagged_max_flag.
We finally filter out only the rows that we want.
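The statement itself is only linked in the fiddle; here is a sketch reconstructed from the walkthrough, with the literals `7` and `100` standing in for the wanted element_id and catalogue_id:

```sql
SELECT id, catalogue_id, element_id, weight
FROM (
  SELECT wmf.*,
         coalesce(lag(max_flag) OVER (ORDER BY weight DESC, flag), 0)
           AS lagged_max_flag                -- shift max_flag down one row
  FROM (
    SELECT wf.*,
           max(flag) OVER (ORDER BY weight DESC, flag) AS max_flag
    FROM (
      SELECT ce.*,
             CASE WHEN element_id = 7 THEN 1 ELSE 0 END AS flag  -- mark cut-off row
      FROM catalog_element ce
      WHERE catalogue_id = 100
    ) wf
  ) wmf
) wlmf
WHERE lagged_max_flag = 0   -- everything up to and including the cut-off row
ORDER BY weight DESC, flag;
```

On the sample data this returns ids 3, 2, 7 in that order.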
The resulting SQL statement looks clumsy, but it does access the data only once and orders by the same criteria all the time.
My experience with similar and actually way bigger statements is that they perform extremely well.
View on DB Fiddle
Using LATERAL to compute the max relevant weight: take all the rows with a greater weight, and only rows up to element_id = 7 having exactly that weight. This may return more than one row for element_id = 7; define an extra criterion to select a single element_id = 7 row.
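A sketch of the described approach; the `element_id <= 7` condition is an assumed stand-in for the extra criterion the answer leaves open:

```sql
SELECT ce.*
FROM catalog_element ce
CROSS JOIN LATERAL (
  SELECT weight AS cut_weight                -- max relevant weight
  FROM catalog_element
  WHERE catalogue_id = ce.catalogue_id
    AND element_id = 7
) t
WHERE ce.catalogue_id = 100
  AND (ce.weight > t.cut_weight
       OR (ce.weight = t.cut_weight AND ce.element_id <= 7))
ORDER BY ce.weight DESC;
```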