skip to Main Content

This is my sample dataset. Seq_no and value columns are given, Cum.sum and Bucket columns we need to find using postgresql. To Find cumulative sum , i can write sum(value) over (order by seq_no rows between unbounded preceding and current row) . But ask is reset cum.sum if value reach threshold value. For example Threshold value is 20 and sum of the row 1,2 and 3 values are 23 which is greater than 20 . so we reset the window function. The next threshold value calculate from seq_no 4 onwards .

enter image description here

2

Answers


  1. try using this sql query for calculating the cumulative sum based on threshold value and resetting the sun after reaching the threshold value each time.

    WITH recursive cte AS (
      SELECT
        seq_no,
        value,
        CASE
          WHEN value > 20 THEN 1
          ELSE 0
        END AS reset,
        value AS cum_sum,
        value AS bucket
      FROM your_table
      WHERE seq_no = 1
      
      UNION ALL
      
      SELECT
        t.seq_no,
        t.value,
        CASE
          WHEN cte.cum_sum + t.value > 20 THEN cte.reset + 1
          ELSE cte.reset
        END,
        CASE
          WHEN cte.cum_sum + t.value > 20 THEN t.value
          ELSE cte.cum_sum + t.value
        END,
        cte.reset + 1
      FROM your_table t
      JOIN cte ON t.seq_no = cte.seq_no + 1
    )
    SELECT seq_no, cum_sum, bucket FROM cte
    ORDER BY seq_no;
    
    Login or Signup to reply.
  2. Couldn’t figure out a way to do it with pure SQL, but here’s a plpgsql routine that works:

    CREATE TABLE public.test(seq_no integer, val numeric);
    
    INSERT INTO public.test(seq_no, val) VALUES (1, 11),(2, 6),(3, 6),(4, 6),(5, 13),(6, 6),(7, 15),(8, 6),(9, 19),(10, 10);
    
    CREATE OR REPLACE FUNCTION public.test_cumolative_sum(arg_threshold integer)
    RETURNS TABLE (seq_number integer, running_val NUMERIC, cum_sum NUMERIC, bucket integer)
    LANGUAGE plpgsql AS
    $$
    DECLARE
        var_table record;
        var_cum_sum NUMERIC;
        var_bucket integer;
    BEGIN
        var_cum_sum := 0;
        var_bucket  := 1;
        FOR var_table IN SELECT seq_no, val FROM public.test ORDER BY seq_no LOOP
            var_cum_sum := var_cum_sum + var_table.val;
            RETURN query
            SELECT
                var_table.seq_no,
                var_table.val,
                var_cum_sum,
                var_bucket;
            
            IF var_cum_sum >= arg_threshold THEN
                var_cum_sum := 0;
                var_bucket := var_bucket + 1;
            END IF;
        END LOOP;
    END;
    $$;
    
    SELECT * FROM public.test_cumolative_sum(20);
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search