
I have the following data:

    sale_id  sale_date   installments  total_value
    1        2023/01/01  2             100.0
    1        2023/02/01  2             0.0
    1        2023/03/01  2             0.0
    1        2023/04/01  3             90.0
    1        2023/05/01  3             0.0
    1        2023/06/01  3             0.0
    2        2023/01/01  1             100.0

I need to spread the total_value of a sale_id over the next X rows, where X is the value of installments. For example, in this scenario the first row should become 50.00 and the second row also 50.00, as follows:

    sale_id  sale_date   installments  total_value
    1        2023/01/01  2             50.0
    1        2023/02/01  2             50.0
    1        2023/03/01  2             0.0
    1        2023/04/01  3             30.0
    1        2023/05/01  3             30.0
    1        2023/06/01  3             30.0
    2        2023/01/01  1             100.0

My first idea was to get every row where total_value > 0 and, for each of them, get the next X rows based on the installments column, but I can't find a way to do it with a single SQL query.
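
The rule described above can be pinned down with a small procedural sketch before looking for a set-based query. This is plain Python rather than SQL, the function name `spread_installments` is made up for illustration, and it assumes the rows of a sale are consecutive months ordered by sale_date:

```python
from collections import defaultdict

# Sample rows from the question: (sale_id, sale_date, installments, total_value).
ROWS = [
    (1, "2023/01/01", 2, 100.0),
    (1, "2023/02/01", 2, 0.0),
    (1, "2023/03/01", 2, 0.0),
    (1, "2023/04/01", 3, 90.0),
    (1, "2023/05/01", 3, 0.0),
    (1, "2023/06/01", 3, 0.0),
    (2, "2023/01/01", 1, 100.0),
]


def spread_installments(rows):
    """Split each non-zero total_value evenly over that row and the next
    installments - 1 rows of the same sale_id (ordered by sale_date)."""
    positions_by_sale = defaultdict(list)
    for position, row in enumerate(rows):
        positions_by_sale[row[0]].append(position)

    values = [0.0] * len(rows)
    for positions in positions_by_sale.values():
        # YYYY/MM/DD strings sort chronologically, so a string sort is enough here.
        positions.sort(key=lambda p: rows[p][1])
        for offset, position in enumerate(positions):
            _, _, installments, total_value = rows[position]
            if total_value > 0:
                share = total_value / installments
                for target in positions[offset:offset + installments]:
                    values[target] += share
    return [(r[0], r[1], r[2], v) for r, v in zip(rows, values)]


for row in spread_installments(ROWS):
    print(*row)
```

On the sample data this yields 50, 50, 0, 30, 30, 30 for sale 1 and 100 for sale 2, which matches the expected table.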

Maybe I'm missing something simple, but could someone help me with ideas? I've also tried to do this with a PL/pgSQL function, but the performance of that approach was not good.

I can't just run a one-off update of everything, since the customer needs to keep the original layout of the data.

4 Answers


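  1. As Atmo suggested, checking installments - row_number() >= 0 and taking max(total_value) per sale_id and group should divide the value correctly. The flg column is a running count of the non-zero rows, so every non-zero row starts a new group.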

    select  sale_id
           ,sale_date
           ,installments
           ,case when installments-row_number() over(partition by sale_id, flg order by sale_date) >= 0 then max(total_value) over(partition by sale_id, flg)/installments else 0 end as total_value     
    from 
    (
    select  *
           ,count (case when total_value > 0 then 1 end) over(partition by sale_id order by sale_date) as flg 
    from    t
    ) t
    
    sale_id  sale_date            installments  total_value
    1        2023-01-01 00:00:00  2             50
    1        2023-02-01 00:00:00  2             50
    1        2023-03-01 00:00:00  2             0
    1        2023-04-01 00:00:00  3             30
    1        2023-05-01 00:00:00  3             30
    1        2023-06-01 00:00:00  3             30
    2        2023-01-01 00:00:00  1             100

    Fiddle
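
To see what the three window expressions in this answer compute, here is a hedged step-by-step trace in plain Python for the rows of sale_id = 1. The variable names mirror the query (`flg`, `rn`), but the Python itself is only an illustration and not part of the answer:

```python
from itertools import accumulate

# Rows of sale_id = 1 from the question, already ordered by sale_date:
# (sale_date, installments, total_value)
SALE_1 = [
    ("2023-01-01", 2, 100.0),
    ("2023-02-01", 2, 0.0),
    ("2023-03-01", 2, 0.0),
    ("2023-04-01", 3, 90.0),
    ("2023-05-01", 3, 0.0),
    ("2023-06-01", 3, 0.0),
]

# flg: running count of rows with total_value > 0,
# i.e. count(case ...) over (partition by sale_id order by sale_date).
flg = list(accumulate(1 if total > 0 else 0 for _, _, total in SALE_1))

# rn: row_number() over (partition by sale_id, flg order by sale_date).
rn = []
for index, group in enumerate(flg):
    starts_group = index == 0 or flg[index - 1] != group
    rn.append(1 if starts_group else rn[-1] + 1)

# group_max: max(total_value) over (partition by sale_id, flg).
group_max = {}
for group, (_, _, total) in zip(flg, SALE_1):
    group_max[group] = max(group_max.get(group, 0.0), total)

# Final CASE expression: a share while installments - rn >= 0, else 0.
values = [
    group_max[group] / installments if installments - number >= 0 else 0.0
    for group, number, (_, installments, _) in zip(flg, rn, SALE_1)
]

print(flg)     # [1, 1, 1, 2, 2, 2]
print(rn)      # [1, 2, 3, 1, 2, 3]
print(values)  # [50.0, 50.0, 0.0, 30.0, 30.0, 30.0]
```

The third row gets 0 because its row number (3) exceeds installments (2) inside the first group.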

  2. There exists a different approach from the other answers (and my original idea in comments) that needs no window function. It uses generate_series.

    As a bonus, it supports installment periods overlapping (but it works just fine if there aren’t any) and will generate rows if the existing ones are not enough to store all installments.

    On that note, considering the records with total_value = 0 are basically unnecessary with this method, you could consider removing them, making the table lighter and faster (in case of a FULL SCAN) to access. I guess that depends on the constraints you talk about in your question.

    Use the below SELECT query:

    SELECT sale_id, sale_date, MAX(installments), SUM(total_value) AS total_value
    FROM (
        SELECT sale_id, generate_series(sale_date, sale_date + make_interval(months => installments - 1), interval '1 month')::date, installments, total_value / installments
        FROM MyTable
    ) T(sale_id, sale_date, installments, total_value)
    GROUP BY sale_id, sale_date
    ORDER BY sale_id, sale_date
    

    I was forced to do something strange with the installments column (taking its MAX), and maybe having the additional records is not what you want.
    To "restore" the behavior of the original table on these points, add a JOIN (an INNER JOIN restores installments and removes the additional records, a LEFT OUTER JOIN only restores installments):

    SELECT T.sale_id, T.sale_date, MyTable.installments, SUM(T.total_value) AS total_value
    FROM (
        SELECT sale_id, generate_series(sale_date, sale_date + make_interval(months => installments - 1), interval '1 month')::date, installments, total_value / installments
        FROM MyTable
    ) T(sale_id, sale_date, installments, total_value)
    JOIN MyTable ON MyTable.sale_id = T.sale_id AND MyTable.sale_date = T.sale_date
    GROUP BY T.sale_id, T.sale_date, MyTable.installments
    ORDER BY T.sale_id, T.sale_date
    

    You can use either of them to create a view.
    However, if your target is to update the table, you do not have to care about the installments column (it will not be updated) nor the additional records (an UPDATE does not create records), so the UPDATE query is:

    UPDATE MyTable
    SET total_value = T2.total_value
    FROM (
        SELECT sale_id, sale_date, SUM(total_value) AS total_value
        FROM (
            SELECT sale_id, generate_series(sale_date, sale_date + make_interval(months => installments - 1), interval '1 month')::date, installments, total_value / installments
            FROM MyTable
        ) T1(sale_id, sale_date, installments, total_value)
        GROUP BY sale_id, sale_date
    ) T2
    WHERE MyTable.sale_id = T2.sale_id AND MyTable.sale_date = T2.sale_date
    

    Just be careful with it (i.e. start a transaction that you can roll back): this query is not idempotent, so running it a second time would split the already-divided values again.
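
The expansion done by generate_series, and the reason for the warning above, can be illustrated with a short sketch. It is plain Python, not SQL. The helpers `add_months` and `redistribute` are hypothetical names, and the sketch assumes first-of-month dates as in the question:

```python
from collections import defaultdict
from datetime import date


def add_months(day, months):
    """Shift a date by whole months. Assumes the day of month exists in the
    target month (true for the first-of-month dates used here)."""
    index = day.year * 12 + (day.month - 1) + months
    return date(index // 12, index % 12 + 1, day.day)


def redistribute(table):
    """table maps (sale_id, sale_date) -> (installments, total_value).
    Mimics the UPDATE: every row is expanded into one share per month, the
    shares are summed per (sale_id, month), and only existing rows are kept."""
    shares = defaultdict(float)
    for (sale_id, sale_date), (installments, total_value) in table.items():
        for k in range(installments):  # one generated row per installment
            shares[(sale_id, add_months(sale_date, k))] += total_value / installments
    return {key: (inst, shares[key]) for key, (inst, _) in table.items()}


# Data from the question.
TABLE = {
    (1, date(2023, 1, 1)): (2, 100.0),
    (1, date(2023, 2, 1)): (2, 0.0),
    (1, date(2023, 3, 1)): (2, 0.0),
    (1, date(2023, 4, 1)): (3, 90.0),
    (1, date(2023, 5, 1)): (3, 0.0),
    (1, date(2023, 6, 1)): (3, 0.0),
    (2, date(2023, 1, 1)): (1, 100.0),
}

once = redistribute(TABLE)
twice = redistribute(once)  # simulates running the UPDATE a second time

print([value for _, value in once.values()])
print([value for _, value in twice.values()])
```

The first pass gives the expected 50, 50, 0, 30, 30, 30, 100. The second pass gives 25, 50, 25, 10, 20, 30, 100, because each 50 and each 30 is treated as a new sale to be split.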

  3. Create a new group whenever a new non-zero total_value appears, then use these groups for the rest of the calculation:

    dbfiddle demo

    with grps as (
      select sale_id, sale_date, installments, total_value,
             sum(case when total_value <> 0 then 1 end) over (partition by sale_id 
                                                              order by sale_date) grp
      from data)
    select sale_id, sale_date, installments, total_value, grp,
           case when row_number() over (partition by sale_id, grp order by sale_date) 
                  <= installments
                then max(total_value) over (partition by sale_id, grp) / installments 
                else 0
           end val
    from grps
    
  4. The CTE divides the data into groups based on sale_id and the number of installments, and numbers the rows of each group with row_number(). It also computes the maximum total_value of each group.

    The total_value of a row is then max_total_value / installments where row_number <= installments, and 0 otherwise.

    with cte as (
      select *, 
             row_number() over (partition by sale_id, installments order by sale_id, sale_date) as rn,
             row_number() over (order by sale_id, sale_date) - row_number() over (partition by sale_id, installments order by sale_id, sale_date) as grp,
             max(total_value) over (partition by sale_id, installments) as max_total_value
      from mytable
    )
    select sale_id, sale_date, installments,
           max(case when rn <= installments 
                    then max_total_value/installments 
                    else 0 end
              ) over (partition by grp, rn) as t_value
    from cte c1
    

    Demo here

    After reading the comments and getting some remarks from @Atmo, I concluded that the following query should be used instead:

    with cte as (
      select *, 
             sum(case when total_value <> 0 then 1 end) over (partition by sale_id order by sale_date) as grp
      from mytable
    ),
    cte2 as (
      select *, max(total_value) over (partition by grp, sale_id) as max_total_value,
                row_number() over (partition by grp, sale_id order by sale_date) as rn
      from cte
    )
    select sale_id, sale_date, installments,
           max(case when rn <= installments 
                    then max_total_value/installments 
                    else 0 end
              ) over (partition by grp, rn, sale_id) as t_value
    from cte2
    order by sale_id, sale_date
    

    The first CTE starts a new group whenever a new non-zero total_value appears.

    The second CTE computes the maximum value and the row number within each of those groups.

    Demo here
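
One case where the two partitioning choices in this answer give different results is when two consecutive payment plans of the same sale share the same installments value. The sketch below is plain Python on hypothetical data that is not in the question, and it only traces the partitioning logic of each query (for this data the outer max(...) over (partition by grp, rn) changes nothing, so it is left out):

```python
from itertools import accumulate

# Hypothetical rows of a single sale, ordered by sale_date:
# (installments, total_value). Both plans use installments = 2.
ROWS = [(2, 100.0), (2, 0.0), (2, 60.0), (2, 0.0)]


def split_by_installments(rows):
    """Partition by installments only, as the first query does."""
    result, seen = [], {}
    for installments, _ in rows:
        seen[installments] = seen.get(installments, 0) + 1
        rn = seen[installments]  # row number inside the partition
        partition_max = max(t for i, t in rows if i == installments)
        result.append(partition_max / installments if rn <= installments else 0.0)
    return result


def split_by_running_group(rows):
    """Partition by a running count of non-zero rows, as the second query does."""
    grp = list(accumulate(1 if total != 0 else 0 for _, total in rows))
    result, rn = [], 0
    for index, (installments, _) in enumerate(rows):
        rn = 1 if index == 0 or grp[index - 1] != grp[index] else rn + 1
        group_max = max(t for g, (_, t) in zip(grp, rows) if g == grp[index])
        result.append(group_max / installments if rn <= installments else 0.0)
    return result


print(split_by_installments(ROWS))   # [50.0, 50.0, 0.0, 0.0]
print(split_by_running_group(ROWS))  # [50.0, 50.0, 30.0, 30.0]
```

With the first partitioning, the second plan (60 over 2 rows) is merged into the first one and lost. The running group keeps the two plans apart.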
