I’m learning SQL and have come across a problem that needs a self-join. I’ve been playing around with the query to understand why the self-join table lays it out the way it does, but haven’t had much luck
Create table Departments (DepartmentID int primary key, Name text);
insert into Departments values
(1001,'SRO'),
(2001,'Drs'),
(3001,'Accounting');
Create table Employees
(
EmployeeID int not null,
DepartmentID int not null,
ManagerID int,
Name varchar(50) not null,
Salary int not null,
primary key(EmployeeID),
foreign key (DepartmentID)
references Departments(DepartmentID)
);
insert into Employees values
(68319,1001,NULL,'Kayling','6000.00'),
(66928,3001,68319,'Blaze','2750.00'),
(67832,1001,68319,'Clare','2550.00'),
(65646,2001,68319,'Jonas','2957.00'),
(67858,2001,65646,'Scarlet','3100.00'),
(69062,2001,65646,'Frank','3100.00'),
(63679,2001,69062,'Sandrine','900.00'),
(64989,3001,66928,'Adelyn','1700.00'),
(65271,3001,66928,'Wade','1350.00');
Now perform two queries to understand what the table was like originally and what it became after the self-join
Query 1
select * from Employees order by Employeeid;
Query 2
select * from Employees e1
join Employees e2 on
e2.ManagerID = e1.EmployeeID
order by e1.employeeid;
The results I get are:
Query 1
EmployeeID DepartmentID ManagerID Name Salary
63679 2001 69062 Sandrine 900
64989 3001 66928 Adelyn 1700
65271 3001 66928 Wade 1350
65646 2001 68319 Jonas 2957
66928 3001 68319 Blaze 2750
67832 1001 68319 Clare 2550
67858 2001 65646 Scarlet 3100
68319 1001 null Kayling 6000
69062 2001 65646 Frank 3100
Query 2 Results
EmployeeID DepartmentID ManagerID Name Salary EmployeeID DepartmentID ManagerID Name Salary
65646 2001 68319 Jonas 2957 69062 2001 65646 Frank 3100
65646 2001 68319 Jonas 2957 67858 2001 65646 Scarlet 3100
66928 3001 68319 Blaze 2750 65271 3001 66928 Wade 1350
66928 3001 68319 Blaze 2750 64989 3001 66928 Adelyn 1700
68319 1001 null Kayling 6000 67832 1001 68319 Clare 2550
68319 1001 null Kayling 6000 66928 3001 68319 Blaze 2750
68319 1001 null Kayling 6000 65646 2001 68319 Jonas 2957
69062 2001 65646 Frank 3100 63679 2001 69062 Sandrine 900
Now my understanding of query two is on the first column only show the managerID and not anyone else (e.g staff only).
This type of self join should allow me to find all the employees (staff) who are managed by each manager. How does a new join table remember that the first salary column relates to the manager while last relates to the employee. Is there no intelligence in the table but rather in the query e.g. e1 and e1?
2
Answers
SELECT *
(as opposed to usinge1.*
or individual column names as ine1.EmployeeID
) just displays all columns one after the other. Your table has 5 columns, so you get all 5 fromFROM
followed by all 5 from fromJOIN
.You can always be more explicit:
Your query list managers first and their related employees, you need to inverse the condition :
left join
can be used to also list employees without managers.Demo here