I have a dogs
table that looks like this:
create table dogs (name char(10), breed, char(20), age int);
insert into dogs (name, breed, age) values ('A', 'Breed 1', 1);
insert into dogs (name, breed, age) values ('B', 'Breed 2', 2);
insert into dogs (name, breed, age) values ('C', 'Breed 3', 3);
insert into dogs (name, breed, age) values ('D', 'Breed 1', 4);
insert into dogs (name, breed, age) values ('E', 'Breed 2', 2);
insert into dogs (name, breed, age) values ('F', 'Breed 3', 5);
insert into dogs (name, breed, age) values ('G', 'Breed 1', 1);
I need to select names of dogs for which their respective breeds and ages are equal, so the output should be the following:
name1 name2
A G
B E
because dogs A and G have the same breed (Breed 1) and age (1), and dogs B and E have the same breed (Breed 2) and age (2).
I know how to find out the names of the dogs that should be in the output:
select name from dogs where (breed, age) in (select breed, age from (select breed, age, count(*) as n from dogs group by breed, age having n > 1) as a);
However I’m lost as to how to convert it to a desired output with two columns.
3
Answers
That is usually done with a self join:
the d1.name>d2.name serves two purposes: 1. to prevent the row to match itself, and 2. to make sure that we get only one row per pair (not (A,G) and (G,A), for example).
Use
GROUP_CONCAT()
to distinguish diffrent namesthe result is
then use
SUBSTRING_INDEX
andsubquery
to change above result to columnsdbfiddle