skip to Main Content

I have tables named course, student and students_in in a MySQL database.

The tables look like this:

course

  course_id       name
    3             Physics
   12             English
   19             Basket Weaving
    4             Computer Science
  212             Discrete Math
  102             Biology
   20             Chemistry
   50             Robotics
    7             Data Engineering

student

id    name
 2    Sally
 1    Bob
17    Robert
 9    Pierre
12    Sydney
41    James
22    William
 5    Mary
 3    Robert
92    Doris
 6    Harry

students_in

  course_id   student_id      grade
    3              2              B
  212              2              A
    3             12              A
   19             12              C
    3             41              A
    4             41              B
  212             41              F
   19             41              A
   12             41              B
    3             17              C
    4              1              A
  102              1              D
  102             22              A
   20             22              A
   20              5              B
   50              3              A
   12             92              B
   12             17              C
    7              6              A

       

Here is a Fiddle: http://sqlfiddle.com/#!9/d3f991/1

My goal is to get the id and name of the students who:

  1. have taken a course with Sally (i.e. "first-degree" relationship), OR
  2. have taken a course with someone who has taken a course with Sally (i.e. "second-degree" relationship), OR
  3. have taken a course with someone who has taken a course with someone who has taken a course with Sally (i.e. "third-relationship" relationship)

Essentially, we’re looking for first-, second-, and third-degree relationships to Sally.

Here is a depiction of what this looks like logically:

enter image description here

Since Sally took course IDs 3 and 212, the desired result would look like this (not the colorful table above, which I provided for illustration of the logic involved):

student_id       student_name
    12           Sydney             <-- took course ID 3 with Sally
    41           James              <-- took course ID 3 and 212 with Sally
    17           Robert             <-- took course ID 3 with Sally
     1           Bob                <-- took course ID 4 with James
    92           Doris              <-- took course ID 12 with James and Robert
   102           William            <-- took course ID 102 with Bob

Other than using a recursive CTE, is it possible to get the desired output using a more simplified approach, such as one or more of the following?

  1. JOINS and subqueries
  2. using the ANY or IN operators

Thank you!

3

Answers


  1. You can union the separate degrees and group/order the resulting names:

    select `name` from (
        (
        select student.`name`
        from student sally
        join students_in sally_course
        on sally.id = sally_course.student_id
        join students_in si1
        on sally_course.course_id = si1.course_id
        join student
        on si1.student_id = student.id
        where student.`name` <> 'sally'
        )
        union
        (
        select student.`name`
        from student sally
        join students_in sally_course
        on sally.id = sally_course.student_id
        join students_in si1
        on sally_course.course_id = si1.course_id
        join students_in si2
        on si1.course_id = si2.course_id
        join student
        on si2.student_id = student.id
        where student.`name` <> 'sally'
        )
        union
        (
        select student.`name`
        from student sally
        join students_in sally_course
        on sally.id = sally_course.student_id
        join students_in si1
        on sally_course.course_id = si1.course_id
        join students_in si2
        on si1.course_id = si2.course_id
        join students_in si3
        on si2.course_id = si3.course_id
        join student
        on si3.student_id = student.id
        where student.`name` <> 'sally'
        )) t
        group by `name`
        order by `name`
    
    Login or Signup to reply.
  2. For completeness, I’d add the recursive approach following with the two steps:

    • base step: retrieves all courses which Sally participated to
    • recursive step: retrieves all courses of students that have attended courses obtained at (n-1)th step

    You can generalize how deep you want to go in the filtering clause of the recursive step (WHERE recursion_depth + 1 <= <N>).

    WITH RECURSIVE cte AS (
        SELECT s.student_id, s.course_id, 0 AS recursion_depth
        FROM       students_in s
        INNER JOIN student ON s.student_id = student.id_ 
        WHERE student.name_ = 'Sally'
      
        UNION ALL
       
        SELECT s2.student_id, s2.course_id, recursion_depth+1 
        FROM       cte
        INNER JOIN students_in s1 ON s1.course_id = cte.course_id
        INNER JOIN students_in s2 ON s1.student_id = s2.student_id
        WHERE recursion_depth + 1 <= 3
    )
    SELECT DISTINCT student.*
    FROM       cte
    INNER JOIN student ON cte.student_id = student.id_
    WHERE NOT student.name_ = 'Sally'
    

    Check the demo here.

    Note: looking for a non-recursive query would translate to manually apply base step + recursive step number of join operations, multiplied by the depth value, thing that the DBMS would do with a recursive query in a more natural and clean way.

    Login or Signup to reply.
  3. Use CTEs for each of the levels with the operator IN.

    Finally use UNION to get all the student ids (without duplicates) and again with the operator IN get the students details from student:

    WITH 
      cte0 AS (SELECT id FROM student WHERE name = 'Sally'),
      cte1 AS (SELECT student_id FROM students_in WHERE course_id IN (SELECT course_id FROM students_in WHERE student_id = (SELECT id FROM cte0))),
      cte2 AS (SELECT student_id FROM students_in WHERE course_id IN (SELECT course_id FROM students_in WHERE student_id IN (SELECT student_id FROM cte1))),
      cte3 AS (SELECT student_id FROM students_in WHERE course_id IN (SELECT course_id FROM students_in WHERE student_id IN (SELECT student_id FROM cte2))),
      cte AS (SELECT student_id FROM cte1 UNION SELECT student_id FROM cte2 UNION SELECT student_id FROM cte3)
    SELECT *
    FROM student 
    WHERE id IN (SELECT student_id FROM cte) AND id <> (SELECT id FROM cte0);
    

    See the demo.

    If you want to use this solution just for comparison to a recursive query then fine, but it is obvious that this is not the proper way to do it.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search