skip to Main Content

I’ve data like below:

ID   Task          Time
1    X started     T1
2    X ended       T2
3    X started     T3    [wrong entry in data]
4    X started     T4
5    X ended       T5 
6    Y started     T6    [wrong entry in data]
7    Y started     T7
8    Y ended       T8 

And, I need to get the data from above in started/ended fashion, but in case of wrong entry I need to pickup the latest one [as T4>T3 and T7>T6].
How can I write SQL on above dataset to get below result ?

ID   Task          Time
1    X started     T1
2    X ended       T2
4    X started     T4
5    X ended       T5 
7    Y started     T7
8    Y ended       T8 

2

Answers


  1. You may use the difference of two row_number()s to define unique groups for the consecutive duplicate values of Task column, then get the max of Time and ID columns, try the following:

    select max(ID), Task, max(Time) Time
    from
    (
      select *,
        row_number() over (order by Time) -
        row_number() over (partition by Task order by Time) grp
      from table_name
    ) T
    group by Task, grp
    order by max(Time)
    

    See a demo.

    For MySQL 5.7 try the following:

    set @gr=0;
    set @ts=null;
    
    select max(ID), Task, max(Time) 
    Time
    from
    (
      select *,
        if(@ts<>Task, @gr:=@gr+1, 
    @gr) grp,
        @ts:=Task
      from table_name
      order by Time
    ) T
    group by Task, grp
    order by max(Time)
    

    Demo.

    Login or Signup to reply.
  2. use a subquery in the where to match the ID, the subquery returns the ID of the row that matches the task (in the primary query), if there is more than one row with the same task value, it will return the row with the greatest id value.

    SELECT * FROM `mytable` p1 
    where id = (select id from `mytable` p2 
                   where p2.task= p1.task
                    order by id DESC limit 1); 
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search