Lets say I have a table like this:
ma_id | act_date |
---|---|
1 | 2023-01-01 |
1 | 2023-01-02 |
1 | 2023-01-03 |
1 | 2023-01-05 |
1 | 2023-01-06 |
2 | 2023-02-08 |
2 | 2023-02-09 |
I have read a lot of guides but couldn’t find what I was looking for…
I want a result like this:
ma_id | start_date | end_date |
---|---|---|
1 | 2023-01-01 | 2023-01-03 |
1 | 2023-01-05 | 2023-01-06 |
2 | 2023-02-08 | 2023-02-09 |
to select a new row each time a date is missing…
I have an sql I have tried here but it is designed for datetime-format and not just date-format….:
select *,min(act_date),max(act_date) from (select t.*,sum(case when prev_act_date >= act_date then 0 else 1 end) over(partition by ma_id,date_format(act_date,'%d-%m-%Y') order by act_date) grp from (select t.*,lag(act_date) over(partition by ma_id,date_format(act_date,'%d-%m-%Y') order by act_date) prev_act_date from XXXX.XXXX t where t.ma_id in (1,2)) t) t group by ma_id,date_format(act_date,'%d-%m-%Y'),grp order by min(act_date)
2
Answers
Assuming MySQL 8+:
The first CTE computes the difference in days between adjacent records. We use the long form of the
LAG()
window function, to ensure that the first record has a date difference of zero. The second CTE sums this date difference to form a pseudo group. Appreciate that the group number only changes when there is a date gap greater than one. Finally, we aggregate byma_id
and pseudo group to find the start and end dates.I will try to explain step by step(using row_number() function):
1. add a fixed date column to each row, like ‘19700101’
2. add column diff of
datediff
from act_date to ‘19700101’3. add column using row_number(partition by yt.ma_id order by yt.act_date) function to add another column to result
4. calculate
diff
–row_num
, we interested in columndiff_minus_row_num
5. use
group by
andmin
,max
to get the results: