skip to Main Content

Input: consecutive numbers from 1 to 100 – 1,2,3,…,100

Desired output:

  • 1st group: 1, 2, 3, 4, 5 -> sum 15 (not bigger than 20)
  • 2nd group: 6, 7 -> sum 13 (not bigger than 20)
  • 3rd group: 8, 9 -> sum 17 (not bigger than 20)
  • 4th group: 10 -> sum 10 (not bigger than 20)
  • 5th group: 11 -> sum 11 (not bigger than 20)
  • nth-1 group: 99 -> sum 99 (bigger than 20, but only one element)
  • nth group: 100 -> sum 100 (bigger than 20, but only one element)
SELECT array_agg(n)
FROM generate_series(1, 100) as n
GROUP BY ... -- TODO -> Here's go magic grouping or other magic tricks

So I have consecutive numbers from 1 to 100, and I would like to group them in that way that sum in group is not bigger than 20 (if single number is bigger than 20 than new group with one element).

Edit:
Max number (in the example 100) and max group’s sum (in the example 20) are dynamic, so I cannot use hard coded solution

2

Answers


  1. I started doing this:

      SELECT ARRAY_AGG(g) as g, MAX(gs) as gs, MAX(g) as m
      FROM (
      SELECT
        generate_series g,
        SUM(generate_series) OVER (ORDER BY generate_series) as gs
      FROM generate_series(1,100)
      )
      WHERE gs < 20
    

    which gives:

    g gs m
    {1,2,3,4,5} 15 5

    After the i tried creating a recursive CTE using above query, and only changing the 1 in generate_series(1,100)

    After seeing lots of NULL values, and a reboot of my PostgreSQL server because of a memory error 😕😉… (probably caused bu programming mistakes on my site), I ended up with this (do note that some conditions are there for safety, to avoid those NULL values and huge memory use)

    WITH RECURSIVE numbers as (
       select generate_series as gg
       from generate_series(1,100)
    )
    ,so AS (
      SELECT ARRAY_AGG(g) as g, MAX(gs) as gs, MAX(g) as m
      FROM (
      SELECT
        gg g,
        SUM(gg) OVER (ORDER BY gg) as gs
      FROM numbers
      )
      WHERE gs < 20
      UNION 
      SELECT ARRAY_AGG(g) as g, MAX(gs) as gs, MAX(g) as m
      FROM (
      SELECT
        generate_series g,
        SUM(generate_series) OVER (ORDER BY generate_series) as gs
      FROM (SELECT m+1 m FROM so WHERE m<30 and not m is null) as mmm
      CROSS JOIN generate_series(mmm.m,30)
      )
      WHERE gs<20 and not g is null
    )
    ,so2 as (
        select array[generate_series] as g1,generate_series as g2
        from generate_series((select max(m)+1 from so ),100)
    )
    select g,gs 
    from so 
    where not gs is null
    union all 
    select g1,g2
    from so2
    ;
    

    see: DBFIDDLE

    result:

    g gs
    {1,2,3,4,5} 15
    {6,7} 13
    {8,9} 17
    {10} 10
    {11} 11
    {12} 12
    {13} 13
    {14} 14
    {15} 15
    {16} 16
    {17} 17
    {18} 18
    {19} 19
    {20} 20
    {21} 21
    {99} 99
    {100} 100

    EDIT: A simpler, but more hard coded solution would be:

    select 
       ARRAY_AGG(g) a, MAX(s)
    from (
    select 
      generate_series g,
      case when generate_series>=10 then generate_series 
                                                   when generate_series<=5 then 1 
                                                   when generate_series<=7 then 2
                                                   when generate_series<=9 then 3
                                                   else 0
                                                   END as k,
      SUM(generate_series) over (partition by case when generate_series>=10 then generate_series 
                                                   when generate_series<=5 then 1 
                                                   when generate_series<=7 then 2 
                                                   when generate_series<=9 then 3
                                                   else 0
                                                   END 
                                 order by generate_series) as s
    from
      generate_series(1,100)
    order by generate_series
    )
    group by k
    order by a;
    
    Login or Signup to reply.
  2. If the sequence is continuous, you can walk another generate_series() back a few numbers, keeping track of the current sum, then aggregate up to the element that exceeds it: demo

    create function groups_up_to_sum(range_start int, range_end int, group_sum_limit int)
      returns table(n int,group_ int[])
    as $f$
    select n,(array_agg(x)filter(where sum_<=group_sum_limit))
    from generate_series(range_start,range_end)n,
    lateral (select n,0 union all
             select x,n+sum(x)over(order by x desc)
             from generate_series(n-1,greatest(n-group_sum_limit,1),-1)x
             ) as g(x,sum_)
    group by n order by n; 
    $f$ language sql;
    
    select * from groups_up_to_sum(5,11,20);
    
    n group_
    5 {5,4,3,2,1}
    6 {6,5,4,3,2}
    7 {7,6,5}
    8 {8,7}
    9 {9,8}
    10 {10,9}
    11 {11}
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search