How do I remove pairs from a self-referencing table in MySQL?

MaryamGhafarinia
May 20, 2023
163 views
3 votes
5 Answers

I joined a table with itself and now get repeated pairs, as highlighted in the image below. How do I remove them?

select DISTINCT A.name as name1 , B.name as name2
from (select name , ratings.* from reviewers inner join ratings on reviewers.id = ratings.reviewer_id ) A ,
(select name , ratings.* from reviewers inner join ratings on reviewers.id = ratings.reviewer_id ) B
where A.reviewer_id <> B.reviewer_id 
and A.book_id = B.book_id
order by name1 , name2 ASC

name1	name2
Alice Lewis	Elizabeth Black
Chris Thomas	John Smith
Chris Thomas	Mike White
Elizabeth Black	Alice Lewis
Elizabeth Black	Jack Green
Jack Green	Elizabeth Black
Joe Martinez	Mike Anderson
John Smith	Chris Thomas
Mike Anderson	Joe Martinez
Mike White	Chris Thomas

Above table used to be an image

Tags: mysql sql

Answers

- based64
- May 20, 2023 at 11:41 am
- 0 votes
To remove the repeated pairs from your query result, you can normalize each pair so that the alphabetically smaller name always comes first, and then keep only the distinct rows. The LEAST and GREATEST functions do the normalization. Here’s an updated version of your query:
```
SELECT DISTINCT
  LEAST(name1, name2) AS name1,
  GREATEST(name1, name2) AS name2
FROM (
  SELECT DISTINCT A.name as name1, B.name as name2
  FROM (
    SELECT name, ratings.*
    FROM reviewers
    INNER JOIN ratings ON reviewers.id = ratings.reviewer_id
  ) A,
  (
    SELECT name, ratings.*
    FROM reviewers
    INNER JOIN ratings ON reviewers.id = ratings.reviewer_id
  ) B
  WHERE A.reviewer_id <> B.reviewer_id
  AND A.book_id = B.book_id
) subquery
ORDER BY name1 ASC, name2 ASC;
```
In this modified query, I’ve wrapped your original query inside a subquery. The outer query puts the two names of every pair in the same order, so (Elizabeth Black, Alice Lewis) and (Alice Lewis, Elizabeth Black) become the same row, and DISTINCT keeps that row only once.

Note: LEAST and GREATEST compare the names as strings, so the order within each pair follows the collation of the name column.

- Stu
- May 20, 2023 at 11:52 am
- 0 votes
You can just do
```
select Name1, Name2
from ...
where Name1 < Name2;
```
See this example
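
As a rough sketch of how the elided FROM clause could be filled in, the query from the question can be wrapped as a derived table and then filtered on name1 < name2. The alias pairs is made up for this illustration, and the sketch assumes that no two reviewers share the same name (such a pair would be filtered out entirely).

```sql
-- Sketch only: "pairs" is a hypothetical alias for the question's query.
SELECT name1, name2
FROM (
  SELECT DISTINCT A.name AS name1, B.name AS name2
  FROM (
    SELECT name, ratings.*
    FROM reviewers
    INNER JOIN ratings ON reviewers.id = ratings.reviewer_id
  ) A,
  (
    SELECT name, ratings.*
    FROM reviewers
    INNER JOIN ratings ON reviewers.id = ratings.reviewer_id
  ) B
  WHERE A.reviewer_id <> B.reviewer_id
    AND A.book_id = B.book_id
) AS pairs
-- Keep only the orientation where the first name sorts before the second.
WHERE name1 < name2
ORDER BY name1, name2;
```

Applied to the sample output in the question, this would keep the row Alice Lewis / Elizabeth Black and drop its mirror Elizabeth Black / Alice Lewis.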

This can be done using GREATEST and LEAST to identify duplicates across columns:

This query finds the duplicates:

select greatest(name, name2) as first_name, least(name, name2) as second_name
from mytable
group by greatest(name, name2), least(name, name2)
having count(1) > 1

Result:

first_name  second_name
Jack Green  Elizabeth Black

Then delete them:

DELETE t.* FROM mytable t
INNER JOIN (
  select greatest(name, name2) as first_name, least(name, name2) as second_name
  from mytable
  group by greatest(name, name2), least(name, name2)
  having count(1) > 1
) as s ON (t.name = s.first_name and t.name2 = s.second_name)
          OR (t.name2 = s.first_name and t.name = s.second_name)

Demo here
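
Because the ON clause above matches a row in either orientation, the DELETE removes both rows of a mirrored pair. If one row of each pair should survive, a possible variant joins on a single orientation only. This is a sketch, assuming the same mytable with columns name and name2, and assuming each pair appears at most once per orientation:

```sql
-- Sketch only: deletes the row whose name sorts after name2
-- and keeps its mirror (the row where name sorts before name2).
DELETE t FROM mytable t
INNER JOIN (
  SELECT GREATEST(name, name2) AS first_name,
         LEAST(name, name2)    AS second_name
  FROM mytable
  GROUP BY GREATEST(name, name2), LEAST(name, name2)
  HAVING COUNT(*) > 1
) AS s ON t.name = s.first_name AND t.name2 = s.second_name;
```

With the duplicate found above, this would delete the row Jack Green / Elizabeth Black and keep Elizabeth Black / Jack Green. If the same orientation can occur more than once, the assumption no longer holds and the rows would need a unique key to tell them apart.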

To remove the duplicates you first need to find them:

SELECT 
  LEAST(name1,name2) as L,
  GREATEST(name1,name2) as G
FROM names
GROUP BY LEAST(name1,name2), GREATEST(name1,name2) 
HAVING count(*)>1;

Then you can delete them:

WITH cte as (
  SELECT 
    LEAST(name1,name2) as L,
    GREATEST(name1,name2) as G
  FROM names
  GROUP BY LEAST(name1,name2), GREATEST(name1,name2) 
  HAVING count(*)>1
)
DELETE FROM names
WHERE (name2,name1) in (select * from cte);

see: DBFIDDLE

I’ve written the DDL and DML statements to reproduce your database, plus a query that returns each pair only once. Here’s the setup code, which may be of help to others:

CREATE TABLE books (
  id INT PRIMARY KEY,
  title VARCHAR(100)
);

CREATE TABLE reviewers (
  id INT PRIMARY KEY,
  name VARCHAR(50)
);

CREATE TABLE ratings (
  id INT PRIMARY KEY,
  reviewer_id INT,
  book_id INT,
  rating INT,
  FOREIGN KEY (reviewer_id) REFERENCES reviewers(id),
  FOREIGN KEY (book_id) REFERENCES books(id)
);

-- Inserting sample records
INSERT INTO reviewers (id, name)
VALUES
  (1, 'Alice Lewis'),
  (2, 'Elizabeth Black'),
  (3, 'Chris Thomas'),
  (4, 'John Smith'),
  (5, 'Mike White'),
  (6, 'Jack Green'),
  (7, 'Joe Martinez'),
  (8, 'Mike Anderson');

INSERT INTO books (id, title)
VALUES
  (1, 'The Gulag Archipelago'),
  (2, 'One Day in the Life of Ivan Denisovich'),
  (3, 'Cancer Ward');

-- Insertion of rating records
INSERT INTO ratings (id, reviewer_id, book_id, rating)
VALUES
  (1, 1, 1, 4),
  (2, 1, 2, 3),
  (3, 2, 1, 5),
  (4, 2, 2, 4),
  (5, 2, 3, 2),
  (6, 3, 1, 3),
  (7, 3, 3, 4),
  (8, 4, 1, 2),
  (9, 4, 3, 3),
  (10, 5, 2, 5),
  (11, 6, 1, 1),
  (12, 6, 2, 3),
  (13, 6, 3, 4),
  (14, 7, 1, 3),
  (15, 7, 2, 4),
  (16, 8, 3, 2);

And here’s the refactored query:

SELECT DISTINCT
  A.name AS name1,
  B.name AS name2
FROM
  (
    SELECT
      reviewers.id,
      reviewers.name,
      ratings.book_id
    FROM
      reviewers
      INNER JOIN ratings ON reviewers.id = ratings.reviewer_id
  ) A
  JOIN (
    SELECT
      reviewers.id,
      reviewers.name,
      ratings.book_id
    FROM
      reviewers
      INNER JOIN ratings ON reviewers.id = ratings.reviewer_id
  ) B ON A.book_id = B.book_id
     AND A.id <> B.id
     AND A.name < B.name
ORDER BY
  name1,
  name2 ASC;

This is the output you’ll get from it:

+-----------------+-----------------+
|      name1      |      name2      |
+-----------------+-----------------+
| Alice Lewis     | Chris Thomas    |
| Alice Lewis     | Elizabeth Black |
| Alice Lewis     | Jack Green      |
| Alice Lewis     | Joe Martinez    |
| Alice Lewis     | John Smith      |
| Alice Lewis     | Mike White      |
| Chris Thomas    | Elizabeth Black |
| Chris Thomas    | Jack Green      |
| Chris Thomas    | Joe Martinez    |
| Chris Thomas    | John Smith      |
| Chris Thomas    | Mike Anderson   |
| Elizabeth Black | Jack Green      |
| Elizabeth Black | Joe Martinez    |
| Elizabeth Black | John Smith      |
| Elizabeth Black | Mike Anderson   |
| Elizabeth Black | Mike White      |
| Jack Green      | Joe Martinez    |
| Jack Green      | John Smith      |
| Jack Green      | Mike Anderson   |
| Jack Green      | Mike White      |
| Joe Martinez    | John Smith      |
| Joe Martinez    | Mike White      |
| John Smith      | Mike Anderson   |
+-----------------+-----------------+
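
The name comparison above would skip a pair of two different reviewers who happen to share the same name. A possible variant, sketched here against the same sample tables, compares reviewer ids instead and joins ratings to itself directly. The aliases a, b, r1 and r2 are chosen for this illustration only.

```sql
-- Sketch only: pair up ratings of the same book and keep the
-- orientation where the first reviewer has the lower id.
SELECT DISTINCT
  r1.name AS name1,
  r2.name AS name2
FROM ratings AS a
JOIN ratings AS b
  ON a.book_id = b.book_id
 AND a.reviewer_id < b.reviewer_id
JOIN reviewers AS r1 ON r1.id = a.reviewer_id
JOIN reviewers AS r2 ON r2.id = b.reviewer_id
ORDER BY name1, name2;
```

In this variant the reviewer with the lower id comes first, so the two names of a pair are not necessarily in alphabetical order. If same-named reviewers must stay distinguishable in the result, r1.id and r2.id could be added to the select list.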
