skip to Main Content

Given the following data:

Course_ID Teacher_ID Minimum_grade Maximum_grade
1 1 7 8
2 2 5 6
3 2 5 7
4 3 5 8
5 3 6 7
6 4 5 6
7 4 4 6
8 4 5 7

using Postgresql, I need to write a query to:

select all teachers who taught 2 separate courses X and Y where
the difference between the minimum grade of course X and maximum grade
of course Y is greater than 2.

Thus, only teacher 4 should be selected.

I tried the following:

SELECT Teacher_ID, MAX(Maximum_grade), MIN(Minimum_grade)
FROM dataset
GROUP BY Teacher_ID
HAVING count(Teacher_ID) > 1 AND (MAX(Maximum_grade) - Min(Minimum_grade)) > 2;

This selects teachers 3 and 4 which should not be the case.

I assume this is because the code I wrote compares the minimum grade of course 4 with the maximum grade of the same course, which should not happen.

I have no idea how to improve my code to obtain the desired results, any help would be much appreciated.

5

Answers


  1. SELECT *
    FROM (
        SELECT Teacher_ID , max(diff) as DIFF, COUNT(Course_ID) as NUMBER_OF_COURSE
        FROM (
            SELECT *, (Maximum_grade  - Minimum_grade) AS diff
            FROM dataset
        )
        GROUP BY Teacher_ID
    )
    WHERE DIFF > 2
    

    result = TEACHER_ID 3 , DIFF 3 , NUMBER_OF_COURSE 2

    Login or Signup to reply.
  2. USING PYTHON

    DF:

    import pandas as pd
    df = pd.DataFrame({'CID': [1,2,3,4,5,6,7,8,8],
                       'TID': [1,2,2,3,3,4,4,4,4],
                       'Min': [7,5,5,5,6,5,4,5,6],
                       'Max': [8,6,7,8,7,6,6,7,7]})
    

    Code:

    ans = []   #Creating new list where we will store our result
    
    #Creating new list with the teachers who attend two or more thn two courses
    TID = [k for k, v in df.groupby(['TID'])['CID'].count().to_dict().items() if v>=2]
    
    
    
    for T in TID:    #Loop Over Teachers
    
        #Lets just select the Min value row
        d = df.loc[df.loc[df['TID']==T][['Min']].idxmin().values[0]]
    
        #Lets add Max to row, while filtering just not select the Min CID here
        d['Max'] = max(df.loc[(df['TID']==T) & (df['CID']!=d['CID'])]['Max'].tolist())
    
    
        if(d['Max'] - d['Min']> 2):
             ans.append({T:  d['Max'] - d['Min']})
        
    ans
    

    Output:

    [{4: 3}]  ### CID : DIff
    
    Login or Signup to reply.
  3. I would solve this by selecting from your table twice, once with an eye to find the minimum grade and the second time to find the maximum. When we join those together we want to do so for teachers who share their id but making sure that the classes have different ids:

    SELECT
        a.Teacher_ID,
        MAX(b.Maximum_grade - a.Minimum_grade) AS diff
    FROM
        dataset as a JOIN
        dataset as b ON
        (
           a.Teacher_ID = b.Teacher_ID AND
           a.Course_ID != b.Course_ID
        )
    GROUP BY
        a.Teacher_ID
    HAVING
        MAX(b.Maximum_grade - a.Minimum_grade) > 2
    

    You might try it here, though I don’t know how long this sites keep these fiddles around: http://sqlfiddle.com/#!9/1c4a72/1

    Login or Signup to reply.
  4. This query can still be optimised (or) converted into sub-queries. This is from the top my head. This consists of two parts, first we need to satisfy the two conditions,

    1. All teachers who taught 2 separate courses X and Y
    2. Diff b/w the minimum grade of course X and maximum grade of course Y is greater than 2.
    select teacher_id, max(max_g) as max_g, min(min_g) as min_g from dataset 
    group by teacher_id 
    having count(distinct course_id) >= 2 and (max(max_g)- min(min_g)) > 2
    

    The above query fetches the records for the above condition, with just a little contradiction. The contradiction is, sometimes the difference b/w is calculated within the same course, ie

    max(max_grade of course X) – min(min_grade of course X)

    To correct this, I’m just selecting records where max(max_grade) and min(min_grade) are from different rows( hence that courses will also be different).

    with diff_gt_two as
    (   
       select 
            teacher_id, 
            max(max_g) as max_g, min(min_g) as min_g 
       from dataset     
       group by teacher_id 
       having count(distinct course_id) >= 2 and 
       (max(max_g)- min(min_g)) > 2
    )
    select dataset.teacher_id from dataset,diff_gt_two
    where 
         dataset.teacher_id = diff_gt_two.teacher_id 
         and 
         ( dataset.max_g = diff_gt_two.max_g or     dataset.min_g = 
           diff_gt_two.min_g) 
    group by dataset.teacher_id
    having count(*) > 1
    

    Edit:
    @JonSG has converted the CTE to subquery, fiddle link: http://sqlfiddle.com/#!9/1c4a72/9
    . Thanks JonSG

    Login or Signup to reply.
  5. You can join the dataset with itself intersecting on same teacher with different courses and get the maximum differences per teacher from this cartesian product:

    SELECT A.Teacher_ID, max(A.Maximum_grade-B.Minimum_grade) as Diff
    FROM dataset A,dataset B 
    WHERE A.Teacher_ID = B.teacher_ID and A.Course_ID <> B.Course_ID
    GROUP BY A.Teacher_ID
    HAVING Diff > 2;
    

    Because of the join’s condition requiring different courses, teachers with only one course will not come out and differences of grades within the same course will be excluded

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search