I’m trying to group the rows of a table based on the time difference between them.
If the records differ by less than 30 days (time interval), they must be grouped together; otherwise, they are considered separate events. The start will be equal to the start date of the first record, and the end will be equal to the end date of the last record.
For example: The following table
id | start | end |
---|---|---|
1 | 2023-08-10 | 2023-09-01 |
1 | 2023-09-20 | 2023-10-14 |
1 | 2023-11-20 | 2023-12-27 |
2 | 2023-08-15 | 2023-09-05 |
2 | 2023-09-20 | 2023-10-14 |
2 | 2023-11-01 | 2023-12-20 |
should become like this:
id | start | end |
---|---|---|
1 | 2023-08-10 | 2023-10-14 |
1 | 2023-11-20 | 2023-12-27 |
2 | 2023-08-15 | 2023-12-20 |
For the first case, the first two records should be grouped together as:
start_2n_row – end_1st_row = 2023-09-20 – 2023-09-01 = 19 days
though the 3rd record should be considered seperate event.
I tried to test the LAG but the grouping was not working properly.
2
Answers
Try this query, it groups rows in a PostgreSQL table according to a 30-day time period using recursion. For each group inside of each id, it determines the earliest start date and most recent end date. When using the query, substitute your_table for the name of your table.
Hope it works 🙂
try this