Finding the maximum difference between columns from different rows - Postgresql

Patrick
February 28, 2023
191 views
3 votes
5 Answers

Given the following data:

Course_ID	Teacher_ID	Minimum_grade	Maximum_grade
1	1	7	8
2	2	5	6
3	2	5	7
4	3	5	8
5	3	6	7
6	4	5	6
7	4	4	6
8	4	5	7

using Postgresql, I need to write a query to:

select all teachers who taught 2 separate courses X and Y where
the difference between the minimum grade of course X and maximum grade
of course Y is greater than 2.

Thus, only teacher 4 should be selected.

I tried the following:

SELECT Teacher_ID, MAX(Maximum_grade), MIN(Minimum_grade)
FROM dataset
GROUP BY Teacher_ID
HAVING count(Teacher_ID) > 1 AND (MAX(Maximum_grade) - Min(Minimum_grade)) > 2;

This selects teachers 3 and 4 which should not be the case.

I assume this is because the code I wrote compares the minimum grade of course 4 with the maximum grade of the same course, which should not happen.

I have no idea how to improve my code to obtain the desired results, any help would be much appreciated.

Answers

- BurakAltnk
- February 28, 2023 at 3:05 pm
- 0 votes
0
```
SELECT *
FROM (
    SELECT Teacher_ID , max(diff) as DIFF, COUNT(Course_ID) as NUMBER_OF_COURSE
    FROM (
        SELECT *, (Maximum_grade  - Minimum_grade) AS diff
        FROM dataset
    )
    GROUP BY Teacher_ID
)
WHERE DIFF > 2
```
result = TEACHER_ID 3 , DIFF 3 , NUMBER_OF_COURSE 2
Login or Signup to reply.

USING PYTHON

DF:

import pandas as pd
df = pd.DataFrame({'CID': [1,2,3,4,5,6,7,8,8],
                   'TID': [1,2,2,3,3,4,4,4,4],
                   'Min': [7,5,5,5,6,5,4,5,6],
                   'Max': [8,6,7,8,7,6,6,7,7]})

Code:

ans = []   #Creating new list where we will store our result

#Creating new list with the teachers who attend two or more thn two courses
TID = [k for k, v in df.groupby(['TID'])['CID'].count().to_dict().items() if v>=2]



for T in TID:    #Loop Over Teachers

    #Lets just select the Min value row
    d = df.loc[df.loc[df['TID']==T][['Min']].idxmin().values[0]]

    #Lets add Max to row, while filtering just not select the Min CID here
    d['Max'] = max(df.loc[(df['TID']==T) & (df['CID']!=d['CID'])]['Max'].tolist())


    if(d['Max'] - d['Min']> 2):
         ans.append({T:  d['Max'] - d['Min']})
    
ans

Output:

[{4: 3}]  ### CID : DIff

- JonSG
- February 28, 2023 at 3:58 pm
- 0 votes
0
I would solve this by selecting from your table twice, once with an eye to find the minimum grade and the second time to find the maximum. When we join those together we want to do so for teachers who share their id but making sure that the classes have different ids:
```
SELECT
    a.Teacher_ID,
    MAX(b.Maximum_grade - a.Minimum_grade) AS diff
FROM
    dataset as a JOIN
    dataset as b ON
    (
       a.Teacher_ID = b.Teacher_ID AND
       a.Course_ID != b.Course_ID
    )
GROUP BY
    a.Teacher_ID
HAVING
    MAX(b.Maximum_grade - a.Minimum_grade) > 2
```
You might try it here, though I don’t know how long this sites keep these fiddles around: http://sqlfiddle.com/#!9/1c4a72/1
Login or Signup to reply.

- vicki
- February 28, 2023 at 4:06 pm
- 0 votes
0
This query can still be optimised (or) converted into sub-queries. This is from the top my head. This consists of two parts, first we need to satisfy the two conditions,
1. All teachers who taught 2 separate courses X and Y
2. Diff b/w the minimum grade of course X and maximum grade of course Y is greater than 2.
```
select teacher_id, max(max_g) as max_g, min(min_g) as min_g from dataset 
group by teacher_id 
having count(distinct course_id) >= 2 and (max(max_g)- min(min_g)) > 2
```
The above query fetches the records for the above condition, with just a little contradiction. The contradiction is, sometimes the difference b/w is calculated within the same course, ie

max(max_grade of course X) – min(min_grade of course X)

To correct this, I’m just selecting records where max(max_grade) and min(min_grade) are from different rows( hence that courses will also be different).
```
with diff_gt_two as
(   
   select 
        teacher_id, 
        max(max_g) as max_g, min(min_g) as min_g 
   from dataset     
   group by teacher_id 
   having count(distinct course_id) >= 2 and 
   (max(max_g)- min(min_g)) > 2
)
select dataset.teacher_id from dataset,diff_gt_two
where 
     dataset.teacher_id = diff_gt_two.teacher_id 
     and 
     ( dataset.max_g = diff_gt_two.max_g or     dataset.min_g = 
       diff_gt_two.min_g) 
group by dataset.teacher_id
having count(*) > 1
```
Edit:
@JonSG has converted the CTE to subquery, fiddle link: http://sqlfiddle.com/#!9/1c4a72/9
. Thanks JonSG
Login or Signup to reply.

- AlainT
- February 28, 2023 at 5:12 pm
- 0 votes
0
You can join the dataset with itself intersecting on same teacher with different courses and get the maximum differences per teacher from this cartesian product:
```
SELECT A.Teacher_ID, max(A.Maximum_grade-B.Minimum_grade) as Diff
FROM dataset A,dataset B 
WHERE A.Teacher_ID = B.teacher_ID and A.Course_ID <> B.Course_ID
GROUP BY A.Teacher_ID
HAVING Diff > 2;
```
Because of the join’s condition requiring different courses, teachers with only one course will not come out and differences of grades within the same course will be excluded
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Finding the maximum difference between columns from different rows – Postgresql

Answers