skip to Main Content

I would like to write an SQL query to solve the below problem.

I have two tables:

          movies 
id        title       genre 
1         return     Comedy
2         Scooby     Documentary
3         Halo       Documentary



    reviews 
movide_id   rating 
    1        1
    1        3
    2        5
    2        5
    3        3

I want to return the best rated movie in each genre, and collect the average score of that movie. So for the above, the expected output would be:

genre         title    stars
Comedy        return    2
Documentary   Scooby    5

Halo is not there because it is a Documentary, and Scooby is also a Documentary and has higher ratings.

I also need the results to be sorted in ascending order of genre.

A few things I’m not sure on,

I know we use ORDER BY ... ASC to sort it.
But not sure about how to create the new columns and the other things.

3

Answers


  1. In logical query processing, window functions are processed after aggregate functions, and therefore aggregate functions can be used as parameters to window functions.

    select
      genre,
      title,
      round(stars, 2) stars
    from (
      select
        m.genre,
        m.title,
        avg(r.rating) stars,
        row_number() over(
          partition by m.genre
          order by avg(r.rating) desc
        ) rn
      from movies m
      join reviews r
      on r.movie_id = m.id
      group by
        m.genre,
        m.title
    ) s
    where rn = 1
    order by genre;
    

    Results:

    +-------------+--------+-------+
    |    genre    | title  | stars |
    +-------------+--------+-------+
    | Comedy      | return | 2.00  |
    | Documentary | Scooby | 5.00  |
    +-------------+--------+-------+
    

    Try it on db<>fiddle.

    Login or Signup to reply.
  2. You first need to pre-aggregate the reviews to get their average ratings, this can be a derived table which you join to movies; you can then rank the movies by stars and return only the top-ranked row for each genre from an outer query.

    Using ranking here means you will see both (or more) titles for a genre if they all happen to have the same average rating.

    select genre, title, stars
    from (
      select m.genre, m.title, stars,
        Dense_Rank() over(partition by genre order by stars desc) rnk
      from movies m
      join (
        select movide_id, Avg(rating) stars
        from reviews
        group by movide_id
      ) r on r.movide_id = m.id
    )t
    where rnk = 1
    order by stars;
    

    See a demo Fiddle

    Login or Signup to reply.
  3. AS an example of why it’s very important to specify your DBMS, this is how short it can be (using PostgreSQL’s DISTINCT ON to pick the best movie per genre)

    SELECT DISTINCT ON (m.genre)
      m.genre,
      m.title,
      ROUND(AVG(r.rating), 2)  AS stars
    FROM
      movies    AS m
    INNER JOIN
      reviews   AS r
        ON r.movie_id = m.id
    GROUP BY
      m.genre,
      m.title
    ORDER BY
      m.genre,
      AVG(r.rating) DESC
    
    genre title stars
    Comedy Return 2.00
    Documentary Scooby 5.00

    fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search