I have a table that looks like this:

id   position    value
5    senior      10000
6    senior      20000
8    senior      30000
9    junior      5000
4    junior      7000
1    junior      7000
3    junior      10000

It is sorted by position and value (asc) already. I want to calculate the number of seniors and juniors that can fit in a budget of 50,000 such that preference is given to seniors.

So for example, here 2 seniors (first and second) + 3 juniors can fit in the budget of 50,000.

id   position    value     cum_sum
5    senior      10000     10000
6    senior      20000     30000
8    senior      30000     60000   ----not possible because it is more than 50000
-----------------------------------   --- so out of 50k, 30k is used for 2 seniors.
9    junior      5000      5000 
4    junior      7000      12000
1    junior      7000      19000 ---with the remaining 20k, these 3 juniors can also fit
3    junior      10000     29000

So the output should look like this:

juniors    seniors
3          2

How can I achieve this in SQL?

Answers


  1. Here is an example that uses a running total:

    select
        count(case when chek_sum_jun > 0 and position = 'junior' then position else null end) chek_jun,
        count(case when chek_sum_sen > 0 and position = 'senior' then position else null end) chek_sen
    from (
        select position,
               -- 20000 is hardcoded: it is what remains of the 50000 budget after the two seniors (30000)
               20000 - sum(case when position = 'junior' then value else 0 end)
                   over (partition by position order by value asc
                         rows between unbounded preceding and current row) chek_sum_jun,
               -- budget left after each senior, cheapest first
               50000 - sum(case when position = 'senior' then value else 0 end)
                   over (partition by position order by value asc
                         rows between unbounded preceding and current row) chek_sum_sen
        from test_table
    ) x
    

    Demo: https://dbfiddle.uk/ZgOoSzF0

  2. PostgreSQL supports the window function SUM(col) OVER ():

    with cte as (
      SELECT *, SUM(value) OVER(PARTITION BY position ORDER BY id) AS cumulative_sum
      FROM mytable
    )
    select position, count(1)
    from cte
    where cumulative_sum < 50000
    group by position
    

    Another way to do it, which returns the results in one row:

    with cte as (
      SELECT *, SUM(value) OVER(PARTITION BY position ORDER BY id) AS cumulative_sum
      FROM mytable
    ),
    cte2 as (
      select position, count(1) as _count
      from cte
      where cumulative_sum < 50000
      group by position
    )
    select
    sum(case when position = 'junior' then _count else null end) juniors,
    sum(case when position = 'senior' then _count else null end) seniors
    from cte2
    

    Demo here

  3. For it to work, the sum has to be not only cumulative but also selective. As mentioned in the comment, you can achieve that with a recursive CTE: online demo

    with recursive 
     ordered as --this will be fed to the actual recursive cte
    (   select *,
               row_number() over (order by position desc,value asc) 
        from test_table)
    ,recursive_cte as 
    ( select id,
             position,
             value, 
             value*(value<50000)::int as cum_sum,
             value<50000 as is_hired,
             2 as next_i
      from ordered
      where row_number=1
      union
      select o.id,
             o.position,
             o.value, 
             case when o.value+r.cum_sum<50000 then o.value+r.cum_sum else r.cum_sum end,
             (o.value+r.cum_sum)<50000 as is_hired,
             r.next_i+1 as next_i
      from recursive_cte r, 
           ordered o
      where o.row_number=next_i
    )
    select count(*) filter (where position='junior') as juniors,
           count(*) filter (where position='senior') as seniors
    from recursive_cte 
    where is_hired;
    
    • row_number() over () is a window function
    • count(*) filter (where...) is an aggregate filter. It’s a faster variant of the sum(case when expr then a else 0 end) or count(nullif(expr)) approach, for when you only wish to aggregate a specific subset of values. Here it only serves to put the counts in columns, as in your expected result; select position, count(*) from recursive_cte where is_hired group by position would return the same counts stacked in rows.

    All it does is order your list according to your priorities in the first cte, then go through it row by row in the second one, collecting the cumulative sum, based on whether it’s still below your limit/budget.

  4. Here’s one possible solution: DB Fiddle

    with seniorsCte as (
      select id, position, value, total
      from budget b
      inner join (
        select id, position, value, (sum(value) over (order by value, id)) total
        from people
        where position = 'senior'
      ) as s 
      on s.total <= b.amount
    )
    , juniorsCte as (
      select j.id, j.position, j.value, j.total + r.seniorsTotal
      from (
        select  coalesce(max(total), 0) seniorsTotal
        , max(b.amount) - coalesce(max(total), 0) remainingAmount
        from budget b
        cross join seniorsCte
      ) as r
      inner join (
        select id, position, value, (sum(value) over (order by value, id)) total
        from people
        where position = 'junior'
      ) as j
      on j.total <= r.remainingAmount
    )
    /* use this if you want the specific records
    select *
    from seniorsCte 
    union all
    select *
    from juniorsCte
    */
    select (select count(1) from seniorsCte) seniors
    , (select count(1) from juniorsCte) juniors
    

    From your question I suspect you’re familiar with window functions, but in case not: the query below returns all rows from the people table where the position is senior and adds a column, total, holding the cumulative total of value over the rows returned, starting with the lowest value and ascending. Sorting by id as well ensures consistent behaviour if there are multiple rows with the same value, though that’s not strictly required if we’re happy to get those in an arbitrary order.

    select id, position, value, (sum(value) over (order by value, id)) total
    from people
    where position = 'senior'
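
To make the running total concrete, here is a small Python sketch of what sum(value) over (order by value, id) yields for the seniors in the question. It is only an illustration: the tuple layout and variable names are made up, and the input is deliberately unsorted to show that the ordering comes from the sort key.

```python
from itertools import accumulate

# (id, value) pairs for the senior rows from the question, in arbitrary order.
seniors = [(8, 30000), (5, 10000), (6, 20000)]

# Order by value, then id, as in the window's "order by value, id".
ordered = sorted(seniors, key=lambda r: (r[1], r[0]))

# accumulate() yields the cumulative sum up to and including each row.
totals = list(accumulate(value for _id, value in ordered))

running = [(id_, value, total) for (id_, value), total in zip(ordered, totals)]
for row in running:
    print(row)
# (5, 10000, 10000)
# (6, 20000, 30000)
# (8, 30000, 60000)
```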
    

    I use the budget table just to hold a single row/value stating what our cutoff is. This avoids hardcoding the 50k value you mentioned, so we can easily amend it as required.

    I’ve used common table expressions (CTEs) so that we can filter the juniors subquery based on the output of the seniors subquery (we only want those juniors up to the difference between the budget and the seniors’ total), whilst still being able to return the results for juniors and seniors independently. If we wanted to return the actual rows rather than just the totals, this lets us perform a union all between the two sets, as demonstrated in the commented-out code.
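
As a rough way to try the two-step idea without a PostgreSQL instance, the sketch below runs a simplified version of the query in SQLite through Python's sqlite3 module. Several things here are assumptions: SQLite stands in for PostgreSQL and must be version 3.25 or newer for window functions, the remaining CTE is a simplification rather than the exact query above, and the sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table people (id integer primary key, position text, value integer);
    insert into people values
        (5, 'senior', 10000), (6, 'senior', 20000), (8, 'senior', 30000),
        (9, 'junior', 5000), (4, 'junior', 7000), (1, 'junior', 7000),
        (3, 'junior', 10000);
    create table budget (amount integer);
    insert into budget values (50000);
""")

query = """
with seniors as (
    -- seniors whose running total still fits the whole budget
    select s.id, s.total
    from budget b
    join (
        select id, sum(value) over (order by value, id) as total
        from people
        where position = 'senior'
    ) as s on s.total <= b.amount
),
remaining as (
    -- what is left once the seniors are paid (hypothetical helper CTE)
    select b.amount - coalesce((select max(total) from seniors), 0) as amount
    from budget b
),
juniors as (
    -- juniors whose running total fits the remainder
    select j.id, j.total
    from remaining r
    join (
        select id, sum(value) over (order by value, id) as total
        from people
        where position = 'junior'
    ) as j on j.total <= r.amount
)
select (select count(*) from seniors) as seniors,
       (select count(*) from juniors) as juniors
"""

seniors, juniors = conn.execute(query).fetchone()
print(seniors, juniors)  # 2 3
```

Changing the single row in budget is enough to rerun the calculation for a different cutoff, which is the reason for keeping the amount in a table.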
