PostgreSQL: get the next X rows for each result

LeandroGuimarães
April 12, 2023

I have the following data:

sale_id	sale_date	installments	total_value
1	2023/01/01	2	100.0
1	2023/02/01	2	0.0
1	2023/03/01	2	0.0
1	2023/04/01	3	90.0
1	2023/05/01	3	0.0
1	2023/06/01	3	0.0
2	2023/01/01	1	100.0

I need to spread the total_value of a sale_id over the next X rows, where X is the installments value. For example, in this scenario the first row should be 50.00 and the second row also 50.00, as follows:

sale_id	sale_date	installments	total_value
1	2023/01/01	2	50.0
1	2023/02/01	2	50.0
1	2023/03/01	2	0.0
1	2023/04/01	3	30.0
1	2023/05/01	3	30.0
1	2023/06/01	3	30.0
2	2023/01/01	1	100.0

My first idea was to get every row where total_value > 0 and, for each of them, get the next X rows based on the installments column, but I can’t find a way to do it with a single SQL query.

Maybe I’m missing something simple, but could someone help me with ideas? I’ve also tried to do this with a PL/pgSQL function, but the performance of that approach was not good.

I can’t just update everything in one go, since the customer needs to keep the original layout of the data.
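
For anyone who wants to run the queries from the answers, the sample data can be loaded roughly like this. The table name mytable and the column types are assumptions, since the question does not show the schema. The answers refer to the table as t, MyTable, data and mytable, so adjust the name as needed (an unquoted MyTable resolves to mytable in PostgreSQL).

```sql
-- Assumed schema: the question only shows the data, not the table definition.
-- total_value is numeric so that total_value / installments is not integer division.
CREATE TABLE mytable (
    sale_id      integer        NOT NULL,
    sale_date    date           NOT NULL,
    installments integer        NOT NULL,
    total_value  numeric(12, 2) NOT NULL
);

-- The seven rows from the question.
INSERT INTO mytable (sale_id, sale_date, installments, total_value) VALUES
    (1, '2023-01-01', 2, 100.0),
    (1, '2023-02-01', 2,   0.0),
    (1, '2023-03-01', 2,   0.0),
    (1, '2023-04-01', 3,  90.0),
    (1, '2023-05-01', 3,   0.0),
    (1, '2023-06-01', 3,   0.0),
    (2, '2023-01-01', 1, 100.0);
```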

Answers

As Atmo suggested, comparing installments with row_number() and dividing max(total_value) per sale_id and group (flg) by installments should split it correctly.

select  sale_id
       ,sale_date
       ,installments
       ,case when installments - row_number() over (partition by sale_id, flg order by sale_date) >= 0
             then max(total_value) over (partition by sale_id, flg) / installments
             else 0
        end as total_value
from 
(
select  *
       ,count (case when total_value > 0 then 1 end) over(partition by sale_id order by sale_date) as flg 
from    t
) t

sale_id	sale_date	installments	total_value
1	2023-01-01 00:00:00	2	50
1	2023-02-01 00:00:00	2	50
1	2023-03-01 00:00:00	2	0
1	2023-04-01 00:00:00	3	30
1	2023-05-01 00:00:00	3	30
1	2023-06-01 00:00:00	3	30
2	2023-01-01 00:00:00	1	100
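
To see why this works, it can help to look at the intermediate columns on their own. The sketch below reuses the inner query from this answer and only adds the row number as an extra column; the alias rn is mine, and t is the table name used above.

```sql
-- flg increases by one at every row with total_value > 0, so each non-zero
-- row and the zero rows that follow it share the same flg value.
-- rn numbers the rows inside each (sale_id, flg) group by date.
SELECT sale_id,
       sale_date,
       installments,
       total_value,
       flg,
       row_number() OVER (PARTITION BY sale_id, flg ORDER BY sale_date) AS rn
FROM (
    SELECT *,
           count(CASE WHEN total_value > 0 THEN 1 END)
               OVER (PARTITION BY sale_id ORDER BY sale_date) AS flg
    FROM t
) AS s
ORDER BY sale_id, sale_date;
```

For the sample data, sale_id 1 gets flg = 1, 1, 1, 2, 2, 2 and rn = 1, 2, 3, 1, 2, 3. The condition installments - row_number() >= 0 is the same as rn <= installments, which is why the third row of the first group (rn = 3, installments = 2) stays at 0.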

- Atmo
- April 13, 2023 at 12:04 am
There exists a different approach from the other answers (and my original idea in comments) that needs no window function. It uses generate_series.

As a bonus, it supports installment periods overlapping (but it works just fine if there aren’t any) and will generate rows if the existing ones are not enough to store all installments.

On that note, since the rows with total_value = 0 are unnecessary with this method, you could consider removing them, which would make the table smaller and faster to read (for example in a full table scan). Whether that is possible depends on the constraints you mention in your question.
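
As a small illustration of the building block, here is what generate_series does for a single sale. This is only a sketch with literal values taken from the 90.0 / 3-installment row of the question, not a query against the real table.

```sql
-- Expands one start date into one row per installment month
-- and attaches the per-installment amount to each generated row.
SELECT generate_series(
           date '2023-04-01',
           date '2023-04-01' + make_interval(months => 3 - 1),
           interval '1 month'
       )::date  AS sale_date,
       90.0 / 3 AS total_value;
```

This yields three rows, for 2023-04-01, 2023-05-01 and 2023-06-01, each carrying a value of 30.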

Use the following SELECT query:
```
SELECT sale_id, sale_date, MAX(installments) AS installments, SUM(total_value) AS total_value
FROM (
    SELECT sale_id, generate_series(sale_date, sale_date + make_interval(months => installments - 1), interval '1 month')::date, installments, total_value / installments
    FROM MyTable
) T(sale_id, sale_date, installments, total_value)
GROUP BY sale_id, sale_date
ORDER BY sale_id, sale_date
```
I had to do something strange with the installments column (aggregating it with MAX, because several source rows can generate the same date), and the additional records may not be what you want.
To "restore" the behavior of the original table on these points, add a JOIN (an INNER JOIN restores installments and removes the additional records; a LEFT OUTER JOIN only restores installments):
```
SELECT T.sale_id, T.sale_date, MyTable.installments, SUM(T.total_value) AS total_value
FROM (
    SELECT sale_id, generate_series(sale_date, sale_date + make_interval(months => installments - 1), interval '1 month')::date, installments, total_value / installments
    FROM MyTable
) T(sale_id, sale_date, installments, total_value)
JOIN MyTable ON MyTable.sale_id = T.sale_id AND MyTable.sale_date = T.sale_date
GROUP BY T.sale_id, T.sale_date, MyTable.installments
ORDER BY T.sale_id, T.sale_date
```
You can use either of them to create a view.
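
A possible sketch of such a view, wrapping the INNER JOIN variant. The view name sale_installments is hypothetical; everything else is taken from the query above.

```sql
-- sale_installments is a hypothetical name; pick whatever fits your schema.
CREATE VIEW sale_installments AS
SELECT T.sale_id, T.sale_date, MyTable.installments, SUM(T.total_value) AS total_value
FROM (
    SELECT sale_id,
           generate_series(sale_date, sale_date + make_interval(months => installments - 1), interval '1 month')::date,
           installments,
           total_value / installments
    FROM MyTable
) T(sale_id, sale_date, installments, total_value)
JOIN MyTable ON MyTable.sale_id = T.sale_id AND MyTable.sale_date = T.sale_date
GROUP BY T.sale_id, T.sale_date, MyTable.installments;

-- The stored rows are left untouched; the divided values are read from the view.
SELECT * FROM sale_installments ORDER BY sale_id, sale_date;
```

Since the question says the original layout of the data has to be kept, a view like this is probably the less intrusive option.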
However, if your goal is to update the table, you do not need to care about the installments column (it is not updated) or the additional records (an UPDATE does not create rows), so the UPDATE query is:
```
UPDATE MyTable
SET total_value = T2.total_value
FROM (
    SELECT sale_id, sale_date, SUM(total_value) AS total_value
    FROM (
        SELECT sale_id, generate_series(sale_date, sale_date + make_interval(months => installments - 1), interval '1 month')::date, installments, total_value / installments
        FROM MyTable
    ) T1(sale_id, sale_date, installments, total_value)
    GROUP BY sale_id, sale_date
) T2
WHERE MyTable.sale_id = T2.sale_id AND MyTable.sale_date = T2.sale_date
```
Be careful with it (e.g. run it inside a transaction that you can roll back): this query is not repeatable, because running it a second time would divide the already-divided values again.
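
One way to do that, as a sketch: run the UPDATE inside an explicit transaction, inspect the result, and only then decide between COMMIT and ROLLBACK. The check query is my addition; the UPDATE is the one from above.

```sql
BEGIN;

UPDATE MyTable
SET total_value = T2.total_value
FROM (
    SELECT sale_id, sale_date, SUM(total_value) AS total_value
    FROM (
        SELECT sale_id, generate_series(sale_date, sale_date + make_interval(months => installments - 1), interval '1 month')::date, installments, total_value / installments
        FROM MyTable
    ) T1(sale_id, sale_date, installments, total_value)
    GROUP BY sale_id, sale_date
) T2
WHERE MyTable.sale_id = T2.sale_id AND MyTable.sale_date = T2.sale_date;

-- Inspect the rows before making the change permanent.
SELECT sale_id, sale_date, installments, total_value
FROM MyTable
ORDER BY sale_id, sale_date;

-- Undo everything; replace with COMMIT once the result looks right.
ROLLBACK;
```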

Create a new group whenever a new non-zero total_value appears, then use these groups for further analysis:

with grps as (
  select sale_id, sale_date, installments, total_value,
         sum(case when total_value <> 0 then 1 end) over (partition by sale_id 
                                                          order by sale_date) grp
  from data)
select sale_id, sale_date, installments, total_value, grp,
       case when row_number() over (partition by sale_id, grp order by sale_date) 
              <= installments
            then max(total_value) over (partition by sale_id, grp) / installments 
            else 0
       end val
from grps

The CTE divides the data into groups based on sale_id and installments, and row_number() numbers the rows within each group. It also computes the maximum total_value for each group.

The value for each row is then max_total_value / installments where the row number is <= installments, and 0 otherwise.

with cte as (
  select *, 
         row_number() over (partition by sale_id, installments order by sale_id, sale_date) as rn,
         row_number() over (order by sale_id, sale_date) - row_number() over (partition by sale_id, installments order by sale_id, sale_date) as grp,
         max(total_value) over (partition by sale_id, installments) as max_total_value
  from mytable
)
select sale_id, sale_date, installments,
       max(case when rn <= installments 
                then max_total_value/installments 
                else 0 end
          ) over (partition by grp, rn) as t_value
from cte c1

After reading the comments and getting some remarks from @Atmo, I concluded that this could be the approach to take:

with cte as (
  select *, 
         sum(case when total_value <> 0 then 1 end) over (partition by sale_id order by sale_date) as grp
  from mytable
),
cte2 as (
  select *, max(total_value) over (partition by grp, sale_id) as max_total_value,
            row_number() over (partition by grp, sale_id order by sale_date) as rn
  from cte
)
select sale_id, sale_date, installments,
       max(case when rn <= installments 
                then max_total_value/installments 
                else 0 end
          ) over (partition by grp, rn, sale_id) as t_value
from cte2
order by sale_id, sale_date

The first CTE starts a new group whenever a new non-zero total_value appears.

The second CTE computes the maximum total_value and the row number within each of those groups.
