I’m having duplicate values in a table which I’m able to fetch using the below sql –
select p.*, j.job_id from (
select count(job_id) cnt, title, email, msg_no from (
select job_id, title, email, msg_no from job_posts order by job_id desc limit 500) as t
group by title, email, msg_no ) p
left join job_posts j on p.title = j.title and p.email = j.email
and p.msg_no = j.msg_no where p.cnt >1
Now the result it’s showing is something like –
cnt | title | email | msg_no | job_id
2 | some title | [email protected] | 123 | 210
2 | some title | [email protected] | 123 | 209
2 | some title | [email protected] | 123 | 208
2 | another title | [email protected] | 243 | 329
2 | another title | [email protected] | 243 | 328
Now I want to delete only the rows having job_id less than the max value for a unique row
i.e i want to delete these rows –
cnt | title | email | msg_no | job_id
2 | some title | [email protected] | 123 | 209
2 | some title | [email protected] | 123 | 208
2 | another title | [email protected] | 243 | 328
2
Answers
Pattern:
With sample data like …
… one option to isolate repeating data rows is to use Count() Over() analytic function …
… the above code selects job_ids that you want to delete, so use it in Where clause to filter your delete action.
… checking the table content after delete …