MySql remove rows which have duplicate column content

Ivan
November 2, 2022
79 views
0 votes
2 Answers

I get database table which contain postal numbers and regions for my country. That table have all information but i need to change it for my purpose.

I need to eliminate all rows that have duplicate content in specific column.

Check screenshot to see result

I want to remove all duplicate rows which have postanski_broj (postal_number) the some. That number need to be unique. I try manualy to set that column to unique but i get duplicate entry when i try to execute statment.

ID is primary key with auto increment.
postanski_broj column is VARCHAR which represent postal_code
naselje column is VARCHAR which represent region

One region can have one postal_code

I try

ALTER  TABLE poste ADD UNIQUE INDEX idx_postanski_br (postanski_broj);

00:03:20 ALTER TABLE poste ADD UNIQUE INDEX idx_postanski_br
(postanski_broj) Error Code: 1062. Duplicate entry ‘11158’ for key
‘idx_postanski_br’ 0.118 sec

ALTER IGNORE TABLE poste ADD UNIQUE INDEX idx_postanski_br (postanski_broj);

00:04:17 ALTER IGNORE TABLE poste ADD UNIQUE INDEX idx_postanski_br
(postanski_broj) Error Code: 1064. You have an error in your SQL
syntax; check the manual that corresponds to your MySQL server version
for the right syntax to use near ‘IGNORE TABLE poste ADD UNIQUE INDEX
idx_postanski_br (postanski_broj)’ at line 1 0.00037 sec

Anyone have sugestion? Thanks

Tags: mysql

Answers

- JakyRuby
- November 2, 2022 at 12:23 am
- 0 votes
0
This solution can take too much time for big tables. Best way of solving this is: Remove duplicate rows in MySQL

You have to delete the rows before applying the unique constraint. Be careful applying this:
```
DELETE p1 FROM poste p1
INNER JOIN poste p2 
WHERE 
    p1.id < p2.id AND 
    p1.postanski_broj = p2.postanski_broj;
```
This should remove the duplicated ones and will keep only the ones with the higher id (id=168044 in your example).
Login or Signup to reply.

- FanoFN
- November 2, 2022 at 12:46 am
- 0 votes
0
If you have other columns with different values than the ones you’ve shown there (except for id), deleting should be your last choice.

I usually would duplicate the table first:
```
CREATE TABLE poste_new LIKE poste;
```
add unique index to the newly created poste_new table:
```
ALTER  TABLE poste_new ADD UNIQUE INDEX idx_postanski_br (postanski_broj);
```
insert the data from poste into poste_new with IGNORE option to skip duplicates based on the unique index:
```
INSERT IGNORE INTO poste_new SELECT * FROM poste;
```
rename the tables:
```
RENAME TABLE poste TO poste_old;
RENAME TABLE poste_new TO poste;
```
The good thing about this is that you’ve minimized the risk of wrong delete and if you’re not satisfied with the new table, you still have the old table intact – effectively making it a backup.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.