skip to Main Content

I’m having duplicate values in a table which I’m able to fetch using the below sql –

select p.*, j.job_id from (
    select count(job_id) cnt, title, email, msg_no from (
        select job_id, title, email, msg_no from job_posts order by job_id desc limit 500) as t 
    group by title, email, msg_no ) p 
left join job_posts j on p.title = j.title and p.email = j.email 
    and p.msg_no = j.msg_no where p.cnt >1

Now the result it’s showing is something like –

cnt  |  title          |    email         | msg_no | job_id
2    | some title      | [email protected] |  123   | 210
2    | some title      | [email protected] |  123   | 209
2    | some title      | [email protected] |  123   | 208
2    | another title   | [email protected]   |  243   | 329
2    | another title   | [email protected]   |  243   | 328

Now I want to delete only the rows having job_id less than the max value for a unique row

i.e i want to delete these rows –

cnt  |  title          |    email         | msg_no | job_id
2    | some title      | [email protected] |  123   | 209
2    | some title      | [email protected] |  123   | 208
2    | another title   | [email protected]   |  243   | 328

2

Answers


  1. Pattern:

    DELETE t1.*
    FROM table t1
    JOIN table t2 USING ( {common columns} )
    WHERE t1.criteria_column < t2.criteria_column
    
    Login or Signup to reply.
  2. With sample data like …

    Create Table IF NOT EXISTS job_posts (job_id Int, 
                                          title VarChar(16), 
                                          email VarChar(32), 
                                          msg_no Int);
                                          
    Insert Into job_posts (job_id, title, email, msg_no) VALUES
    (105, 'single title X', '[email protected]', 103),
    (104, 'single title Y', '[email protected]', 103),
      
    (210, 'some title', '[email protected]', 123), 
    (209, 'some title', '[email protected]', 123), 
    (208, 'some title', '[email protected]', 123),
    (329, 'another title', '[email protected]', 243), 
    (328, 'another title', '[email protected]', 243);
    
    Select * From job_posts Order By job_id;
    /*
    job_id  title               email              msg_no
    ------  ------------------  -----------------  ------
       104  single title Y      [email protected]       103
       105  single title X      [email protected]       103
       208  some title          [email protected]      123
       209  some title          [email protected]      123
       210  some title          [email protected]      123
       328  another title       [email protected]        243
       329  another title       [email protected]        243  */
    

    … one option to isolate repeating data rows is to use Count() Over() analytic function …

    SELECT x.job_id
    FROM 
      ( Select job_id, 
               Count(job_id) Over(Partition By title, email, msg_no Order By job_id Desc
                              Rows Between Unbounded Preceding And Current Row) as row_nmbr,
               Count(job_id) Over(Partition By title, email, msg_no) as num_of_rows
        From   job_posts
      ) x
    WHERE num_of_rows > 1 And row_nmbr > 1
    /*
    job_id
    ------
       328
       209
       208     */
    

    … the above code selects job_ids that you want to delete, so use it in Where clause to filter your delete action.

    DELETE FROM job_posts
    WHERE job_id IN( SELECT x.job_id
                     FROM 
                       ( Select job_id, 
                                Count(job_id) Over(Partition By title, email, msg_no Order By job_id Desc
                                                   Rows Between Unbounded Preceding And Current Row) as row_nmbr,
                                Count(job_id) Over(Partition By title, email, msg_no) as num_of_rows
                         From   job_posts
                       ) x
                     WHERE num_of_rows > 1 And row_nmbr > 1
                   );
    

    … checking the table content after delete …

    Select * From job_posts;
    /*
    job_id  title               email              msg_no
    ------  ------------------  -----------------  ------
       105  single title X      [email protected]       103
       104  single title Y      [email protected]       103
       210  some title          [email protected]      123
       329  another title       [email protected]        243  */
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search