skip to Main Content

Having the following table

mysql> desc customer;
+------------------+--------------+------+-----+---------+-------+
| Field            | Type         | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+-------+
| id               | char(36)     | NO   | PRI | NULL    |       |
| firstname        | varchar(255) | YES  |     | NULL    |       |
| date_entered     | datetime     | YES  |     | NULL    |       |
| date_modified    | datetime     | YES  |     | NULL    |       |
| modified_user_id | char(36)     | YES  |     | NULL    |       |
| created_by       | char(36)     | YES  |     | NULL    |       |
| description      | text         | YES  |     | NULL    |       |
| deleted          | tinyint(1)   | YES  |     | 0       |       |
| assigned_user_id | char(36)     | YES  |     | NULL    |       |
| lastname         | varchar(255) | YES  |     | NULL    |       |
| email            | varchar(255) | YES  |     | NULL    |       |
| pc               | varchar(255) | YES  |     | NULL    |       |
| country          | varchar(100) | YES  |     | NULL    |       |
| telephone        | varchar(255) | YES  |     | NULL    |       |
| dob              | date         | YES  |     | NULL    |       |
+------------------+--------------+------+-----+---------+-------+

And a set of duplicate records similar to the following

mysql> SELECT email, COUNT(*) count FROM customer GROUP BY email HAVING count > 1 ORDER BY count DESC limit 10;
+-----------------------------+-------+
| email                       | count |
+-----------------------------+-------+
| [email protected]           |     2 |
| [email protected]      |     2 |
| [email protected]    |     2 |
| [email protected]           |     2 |
| [email protected] |     2 |
| [email protected]         |     2 |
| [email protected]            |     2 |
| [email protected]          |     2 |
| [email protected]       |     2 |
| [email protected]      |     2 |
+-----------------------------+-------+

For example

mysql> select firstname, lastname, pc, country, telephone, dob, date_entered, deleted from customer where email = "[email protected]"
+-----------------------+----------+------------+---------+----------+-----------+----------+------+---------+
| firstname             | lastname | pc         | country | telephone | dob  | date_entered        | deleted |
+-----------------------+----------+------------+---------+-----------+------+---------------------+---------+
| ANTONIO CHICA GALLEGO |          |            | ES      | NULL      | NULL | 2023-03-24 21:50:15 |       0 |
| ANTONIO CHICA GALLEGO |          |            | ES      |           | NULL | 2023-04-04 09:42:11 |       0 |
+-----------------------+----------+------------+---------+-----------+-----------+----------+------+--------+

I need to remove one of these duplicate records, so that only one remains of each duplicate record

How to remove one of the duplicate records?

Thanks in advance

2

Answers


  1. DELETE FROM customer 
    WHERE id NOT IN (
        SELECT id FROM (
            SELECT id FROM customer 
            WHERE email = "[email protected]"
            ORDER BY date_entered DESC 
            LIMIT 1
        ) temp
    )
    AND email = "[email protected]";
    
    • deleting all duplicate records for email "[email protected]" except latest date_entered
    • modify based on requirements

    edit:

    DELETE c1
    FROM customer c1, customer c2
    WHERE c1.email = c2.email
    AND c1.date_entered < c2.date_entered
    
    • deletes one of the duplicate records for each email address except latest date_entered
    • detail: join the customer table with itself(creates alias c1, c2) — compare email addresses and date_entered — WHERE specifies (the records selected for deletion are those with the same email address + older date_entered) — selecting c1 in DELETE removes the older record
    Login or Signup to reply.
  2. You could try to use this query

    UPDATE customer c1 SET c1.deleted = 1
    WHERE  EXISTS (
           SELECT 1
           FROM  (SELECT 1
                  FROM   customer c2
                  WHERE  c2.email = c1.email AND c2.id < c1.id) c3)
    

    That means per each record, if there are existed other records that have the same email and smaller id, then update value of its deleted column to 1.

    This query will update value of deleted column of all records that have the same email, except the one will smallest id.

    In case that you actually want to delete records, just replace UPDATE with DELETE statement.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search