i have a table that looks like this:
id position value
5 senior 10000
6 senior 20000
8 senior 30000
9 junior 5000
4 junior 7000
3 junior 10000
It is sorted by position and value (asc) already. I want to calculate the number of seniors and juniors that can fit in a budget of 50,000 such that preference is given to seniors.
So for example, here 2 seniors (first and second) + 3 juniors can fit in the budget of 50,000.
id position value cum_sum
5 senior 10000 10000
6 senior 20000 30000
8 senior 30000 60000 ----not possible because it is more than 50000
----------------------------------- --- so out of 50k, 30k is used for 2 seniors.
9 junior 5000 5000
4 junior 7000 12000
1 junior 7000 19000 ---with the remaining 20k, these 3 juniors can also fit
3 junior 10000 29000
so the output should look like this:
juniors seniors
3 2
how can i achieve this in sql?
4
Answers
This example of using a running total:
demo : https://dbfiddle.uk/ZgOoSzF0
postgresql supports window SUM(col) OVER()
An other way to do it to get results in one row :
Demo here
For it to work, the sum has to be not only cumulative, but also selective. As mentioned in the comment, you can achieve that with a recursive cte: online demo
row_number() over ()
is a window functioncount(*) filter (where...)
is an aggregate filter. It’s a faster variant of thesum(case when expr then a else 0 end)
orcount(nullif(expr))
approach, for when you only wish to sum a specific subset of values. That’s just to put those in columns as you did in your expected result, but it could be done with aselect position, count(*) from recursive_cte where is_hired group by position
, stacked.All it does is order your list according to your priorities in the first cte, then go through it row by row in the second one, collecting the cumulative sum, based on whether it’s still below your limit/budget.
Here’s one possible solution: DB Fiddle
From your question I suspect you’re familiar with window functions; but in case not; the below query pulls back all rows from the
people
table where theposition
issenior
, and creates a column,total
which is our cumulative total of thevalue
of the rows returned, starting with the lowest value, ascending (then sorting by id to ensure consistent behaviour if there’s multiple rows with the same value; though that’s not strictly required if we’re happy to get those in an arbitrary order).The
budget
table I just use to hold a single row/value saying what our cutoff is; i.e. this avoids hardcoding the50k
value you mentioned, so we can easily amend it as required.The common table expressions (CTEs) I’ve used to allow us to filter our juniors subquery based on the output of our seniors subquery (i.e. as we only want those juniors up to the difference between the budget and the senior’s total), whilst allowing us to return the results of juniors and seniors independently (i.e. if we wanted to return the actual rows, rather than just totals, this allows us to perform a
union all
between the two sets; as demonstrated in the commented out code.