I have an unusual problem I’m trying to solve with SQL where I need to generate sequential numbers for partitioned rows but override specific numbers with values from the data, while not breaking the sequence (unless the override causes a number to be used greater than the number of rows present).
I feel I might be able to achieve this by selecting the rows where I need to override the generated sequence value and the rows I don’t need to override the value, then union
ing them together and somehow using coalesce
to get the desired dynamically generated sequence value, or maybe there’s some way I can utilise recursive
.
I’ve not been able to solve this problem yet, but I’ve put together a SQL Fiddle which provides a simplified version:
http://sqlfiddle.com/#!17/236b5/5
The desired_dynamic_number
is what I’m trying to generate and the generated_dynamic_number
is my current work-in-progress attempt.
Any pointers around the best way to achieve the desired_dynamic_number
values dynamically?
Update:
I’m almost there using lag
:
3
Answers
The real world problem I was trying to solve did not have a nicely ordered
secondary_order_by
column, instead it would be something a bit more randomised (a created timestamp).For the benefit of people who stumble across this question with a similar problem to solve, a colleague solved this problem using a cartesian join, who's solution I'm posting below. The solution is Snowflake SQL which should be possible to adapt to Postgres. It does fall down on higher
override_as_number
values though unless thefrom table(generator(rowcount => 1000))
1000 value is not increased to something suitably high.The SQL:
In the end I went for a simpler solution using a temporary table and cursor to inject
override_as_number
values and shuffle other numbers.Below query gives exact result, but you need to verify with all combinations
column "final" is the result
Hope this helps!
step-by-step demo:db<>fiddle
override_as_number <> NULL
followed byNULL
records. So, for instance, your(AAA, d)
to(AAA, f)
belongs to the same subpartition/group.first_value()
gives the first value of such subpartition.COALESCE
ensures a non-NULL
result from thefirst_value()
function if your partition starts with aNULL
record.row_number() - 1
creates a row count within a subpartition, starting with0
.first_value()
of a subpartition with the row count creates your result: Beginning with the one non-NULL
record of a subpartition (adding the0
row count), the first followingNULL
records results in the value +1 and so forth.