skip to Main Content

Lets say I have a table like this:

ma_id act_date
1 2023-01-01
1 2023-01-02
1 2023-01-03
1 2023-01-05
1 2023-01-06
2 2023-02-08
2 2023-02-09

I have read a lot of guides but couldn’t find what I was looking for…
I want a result like this:

ma_id start_date end_date
1 2023-01-01 2023-01-03
1 2023-01-05 2023-01-06
2 2023-02-08 2023-02-09

to select a new row each time a date is missing…

I have an sql I have tried here but it is designed for datetime-format and not just date-format….:

select *,min(act_date),max(act_date) from (select t.*,sum(case when prev_act_date >= act_date then 0 else 1 end) over(partition by ma_id,date_format(act_date,'%d-%m-%Y') order by act_date) grp from (select t.*,lag(act_date) over(partition by ma_id,date_format(act_date,'%d-%m-%Y') order by act_date) prev_act_date from XXXX.XXXX t where t.ma_id in (1,2)) t) t group by ma_id,date_format(act_date,'%d-%m-%Y'),grp order by min(act_date)

2

Answers


  1. Assuming MySQL 8+:

    WITH cte AS (
        SELECT *, DATEDIFF(act_date,
                           LAG(act_date, 1, act_date - INTERVAL 1 DAY) OVER
                               (PARTITION BY ma_id ORDER BY act_date)) - 1 AS diff
        FROM yourTable
    ),
    cte2 AS (
        SELECT *, SUM(diff) OVER (PARTITION BY ma_id ORDER BY act_date) AS grp
        FROM cte
    )
    
    SELECT ma_id, MIN(act_date) AS start_date, MAX(act_date) AS end_date
    FROM cte2
    GROUP BY ma_id, grp
    ORDER BY 1, 2;
    

    The first CTE computes the difference in days between adjacent records. We use the long form of the LAG() window function, to ensure that the first record has a date difference of zero. The second CTE sums this date difference to form a pseudo group. Appreciate that the group number only changes when there is a date gap greater than one. Finally, we aggregate by ma_id and pseudo group to find the start and end dates.

    Login or Signup to reply.
  2. I will try to explain step by step(using row_number() function):

    1. add a fixed date column to each row, like ‘19700101’
    select yt.ma_id, yt.act_date, '19700101' from yourTable yt;
    

    add fixed column

    2. add column diff of datediff from act_date to ‘19700101’
    select yt.ma_id, yt.act_date, '19700101', datediff(date(yt.act_date), '19700101') as diff from yourTable yt;
    

    add diff

    3. add column using row_number(partition by yt.ma_id order by yt.act_date) function to add another column to result
     select yt.ma_id, yt.act_date, '19700101', datediff(date(yt.act_date), '19700101') as diff, row_number() over (partition by yt.ma_id order by date(yt.act_date)) as row_num from yourTable yt;
    

    add row_num

    4. calculate diffrow_num, we interested in column diff_minus_row_num
     select yt.ma_id, yt.act_date, '19700101', datediff(date(yt.act_date), '19700101') as diff, row_number() over (partition by yt.ma_id order by date(yt.act_date)) as row_num, datediff(date(yt.act_date), '19700101')  -  row_number() over (partition by yt.ma_id order by date(yt.act_date)) as diff_minus_row_num from yourTable yt;
    

    minus row num

    5. use group by and min, max to get the results:

     select outside.ma_id, min(act_date) as 'start_date', max(act_date) as 'end_date' from (select yt.ma_id, yt.act_date, '19700101', datediff(date(yt.act_date), '19700101') as diff, row_number() over (partition by yt.ma_id order by date(yt.act_date)) as row_num, datediff(date(yt.act_date), '19700101')  -  row_number() over (partition by yt.ma_id order by date(yt.act_date)) as diff_minus_row_num from yourTable yt) outside group by outside.diff_minus_row_num order by 1,2;
    

    group by outside

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search