skip to Main Content

I am trying to identify sum of quantity for each group if date ranges are overlapping. I am using postgresql to solve for it.
For example

id group start_date end_date quantity
1 a 2020-09-11 2020-10-09 50
1 a 2020-09-11 2020-10-31 20
1 a 2020-11-01 2020-12-01 7
1 a 2020-11-15 2020-11-20 6
2 b 2020-10-06 2020-10-30 10
2 b 2020-10-09 2022-10-17 5
2 b 2020-10-15 2022-10-26 3

What I am trying to achieve is the following:
Expected Output: "edited"

id group start_date end_date quantity
1 a 2020-09-11 2020-10-31 70
1 a 2020-11-01 2020-12-01 13
2 b 2020-10-06 2020-10-30 18

would appreciate your help!

2

Answers


  1. SOLUTION 1 : when all the date intervals within the same group are overlapping

    SELECT id
         , group
         , min(start_date) AS start_date
         , max(end_date) AS end_date
         , Sum(quantity) AS quantity
      FROM mytable
    GROUP BY id, group
    

    SOLUTION 2 : when all the date intervals within the same group are not overlapping

    Using a window function would be great but I don’t see how to specify the window which corresponds to a subset of date ranges which overlap. So I come back to a basic solution based on a self-join which select for every row of table a the subset of rows of table b which overlap with that row :

    SELECT DISTINCT ON (a.id, date_start, date_end)
           a.id
         , min(b.start_date) AS date_start
         , max(b.end_date) AS date_end
         , sum(b.quantity) AS quantity
      FROM mytable AS a
     INNER JOIN mytable AS b
        ON a.id = b.id
       AND a.start_date < b.end_date
       AND a.end_date > b.start_date
     GROUP BY a.id, a.start_date, a.end_date
     ORDER BY a.id, date_start
    

    see the test result in dbfiddle

    Login or Signup to reply.
  2. You may use the LAG function to check the consecutive overlapped date periods and create unique groups for them as the following:

    WITH create_groups AS
    (
      SELECT *, SUM(chk) OVER (PARTITION BY id, group_ ORDER BY start_date) grp
      FROM
       (
         SELECT *,
            CASE 
             WHEN start_date <= LAG(end_date, 1, end_date) OVER (PARTITION BY id, group_ ORDER BY start_date) AND
                  start_date >=LAG(start_date, 1, start_date) OVER (PARTITION BY id, group_ ORDER BY start_date) 
             THEN 0 ELSE 1
            END AS chk
         FROM table_name
      ) T
    )
    SELECT id, group_, 
           MIN(start_date) start_date,
           MAX(end_date) start_date,
           SUM(quantity) quantity
    FROM create_groups
    GROUP BY id, group_, grp
    ORDER BY id, group_, grp
    

    See a demo.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search