Assume a table salaries that contains employee_id, department_id and salary, as follows.
e_id | dpt_id | salary
---- | ------ | ------
1 | 2 | 1000
2 | 2 | 2000
3 | 5 | 3000
4 | 6 | 500
5 | 5 | 100
6 | 1 | 1050
I wanted to bring the dpt with the maximum/minimum salaries and I used the following query which failed:
select sum_sal_per_dpt.dpt_id, max(sum_sal_per_dpt.total_salaries)
from (
select dpt_id, sum(salary) as total_salaries
from salaries
group by dpt_id ) as sum_sal_per_dpt
group by sum_sal_per_dpt.dpt_id
union
select sum_sal_per_dpt.dpt_id, min(sum_sal_per_dpt.total_salaries)
from (
select dpt_id, sum(salary) as total_salaries
from salaries
group by dpt_id ) as sum_sal_per_dpt
group by sum_sal_per_dpt.dpt_id
The output I get is a table with the dpt_id and the summation of salaries, as if the subquery is only executed:
dpt_id | salary
------ | ------
2 | 3000
5 | 3100
6 | 500
1 | 1050
instead of
dpt_id | salary
------ | ------
5 | 3100
6 | 500
Same if I write it in a CTE fashion.
with
sum_sal_per_dpt as (select dpt_id, sum(salary) as total_salaries
from salaries
group by dpt_id)
select sum_sal_per_dpt.dpt_id, max(sum_sal_per_dpt.total_salaries)
from sum_sal_per_dpt
group by sum_sal_per_dpt.dpt_id
union
select sum_sal_per_dpt.dpt_id, min(sum_sal_per_dpt.total_salaries)
from sum_sal_per_dpt
group by sum_sal_per_dpt.dpt_id
Can someone explain?
2
Answers
The following should give you the results you expect:
Note that the above code will give you multiple rows if you have multiple max and/or min values, you can easily adjust this depending on what you want to see.
Within your code, you were using group by
dpt_id
, which is why you were still getting max values perdpt_id
, hence multiple rows, additional confusion might have been created with usingunion
instead ofunion all
, whicEssentialy the same answer as from Boris Jovanovic, just without an union.
Result:
|dpt_id|total_salaries|
|—|—|
|5|3100|
|6|500|
Example on DB-Fiddle.uk
Why your inital queries don’t work:
You always fall back to using "GROUP BY" and instruct the DBMS to aggregate your results by the dpt_id.
So it will never sort out any department id.
Max(XYZ) isn’t a filter/aggregate function to only give you the maximal value of a column and all corresponding fields to this maximum value.
It will give you the maximal value for every distinct row – whereby distinction is defined by the columns you refer to in der GROUP BY clause
If you change the first query to just give out the Max/Min Values, you can see that the values are already there – just unusable.
I just then applied the solution for this problem from the aggergate tutorial from the official postgres documentation.
Postgres Documentation for aggregation