Mysql - Correlated Subquery appears to use circular logic

JosephErickson
November 9, 2023
245 views
0 votes
2 Answers

I googled correlated subquery, and it gave me the below example. However, it appears in the outer query, that department id will be determined by a dynamic salary, which in turn, is determined by a relationship between department_id in the outer query and department_id in the inner query. This seems like very circular logic. How can both department_ids have a relationship when salary is not yet determined?

Tags: correlated-subquery mysql

Answers

- ysth
- November 9, 2023 at 6:27 pm
- 0 votes
0
You are confusing the order of operations. The correlated subquery is essentially a join; in principle it will be performed for all rows (though in practice there is often some optimization) before the where conditions are applied. And joins (again, in principal) execute first and find all the combined source rows that then are filtered by where conditions, then grouped by group bys, etc.

Essentially that’s what a correlated subquery is: a subquery that doesn’t resolve without information from the outer rows.

Does the equivalent
```
select e.salary, e.department_id
from employees e
join (
    select department_id, avg(salary) avg_salary
    from employees
    group by department_id
) department_average
    using (department_id)
where e.salary > department_average.salary
```
also confuse you?
Login or Signup to reply.

- BillKarwin
- November 9, 2023 at 6:30 pm
- 0 votes
0
Think of this pseudocode:
```
for each employee 'E'
do
  select all the employees in the same department.
  calculate the average salary of those employees.
  if the salary for employee 'E' is greater than the resulting average, 
    then list that employee, 
    otherwise, skip that employee.
done
```
So the subquery calculates the average salary of all employees in the same department as the employee being examined in the outer query. To do this, it needs to filter by the outer.department_id.

The example is meant to demonstrate that in a correlated subquery, the subquery is not evaluated before the outer query. It must be evaluated repeatedly, once for each row of the outer query, because it will be compared to each respective employee’s salary in turn.

The example query actually has a stylistic mistake, in my opinion. The GROUP BY department_id is superfluous, because by definition the subquery matches only one department_id each time it runs. So there will only be one group in each evaluation of the subquery regardless.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Mysql – Correlated Subquery appears to use circular logic

Answers