skip to Main Content

I have created an SQL query that returns me elements from a film table that satisfy one of two conditions. They are either the most profitable (gross – budget) or the least expensive movies to make.

However, i wanted to add a column to the return of the query that said ‘most profitable’ or ‘least expensive’ in function of whichever of the conditions the tuple satisfied. I tried looking online for a solution; and i decided to try using CASE WHEN in the SELECT part of the Query.

Original sql query:

SELECT DISTINCT Films.title, Films.year
FROM Films,
(
    SELECT DISTINCT (MAX(Films.gross-Films.budget))AS profit FROM Films
) AS Temp1
WHERE 
(
    (Films.gross-Films.budget)=Temp1.profit) 
    OR (Films.budget)=(SELECT DISTINCT Min(Films.budget) FROM Films)
)

CASE WHEN attempt:

SELECT DISTINCT 
    CASE WHEN Temp1.profit=Max(Temp1.profit) THEN 'most profitable' ELSE 'least expensive' END AS feature, 
    Films.title, 
    Films.year
FROM Films,
(
    SELECT DISTINCT (MAX(Films.gross-Films.budget))AS profit FROM Films
) AS Temp1
WHERE 
(
    (Films.gross-Films.budget)=Temp1.profit) 
    OR (Films.budget)=(SELECT DISTINCT Min(Films.budget) FROM Films)
)

However, that gave all sorts of errors like:
column "temp1.profit" must appear in the GROUP BY clause or be used in an aggregate function

I am not sure why it’s asking this but i tried nonetheless to give it what it wanted by adding the line

GROUP BY TEMP1.profit,Films.title,Films.year;

This made the error disappear but now all of the rows get the same ‘most profitable’ value to the feature column even if that’s not why they are there!

I don’t know if there’s any way to make this work, but even a completely different way to get that column working would be a great help.

3

Answers


  1. You can’t directly reuse a calculation from your WHERE clause in your SELECT clause, but you can get this effect using WITH queries or by moving the WHERE clause to an outer select. In your case I think combining both strategies gets the cleanest result:

    WITH max_profit AS (SELECT MAX(gross-budget) FROM Films),
    min_cost AS (SELECT MIN(budget) FROM Films)
    SELECT title, year, most_profitable, least_expensive FROM
      (SELECT 
          Films.title, Films.year, 
          (Films.gross-Films.budget = (SELECT * FROM max_profit)) AS most_profitable,
          (Films.budget = (SELECT * FROM min_cost)) AS least_expensive
       FROM Films
      )
      WHERE most_profitable OR least_expensive;
    

    The output here is a little different from what you were trying for: it should have columns named most_profitable and least_expensive that are set to true or false, with always at least one set to true. That seems better to me because maybe the same film could be both the cheapest and most profitable, but you can put your case statement back in the outer SELECT clause if you want:

    WITH max_profit AS (SELECT MAX(gross-budget) FROM Films),
    min_cost AS (SELECT MIN(budget) FROM Films)
    SELECT title, year, (CASE WHEN most_profitable THEN 'most profitable' ELSE 'least expensive' END) FROM
      (SELECT 
          Films.title, Films.year, 
          (Films.gross-Films.budget = (SELECT * FROM max_profit)) AS most_profitable,
          (Films.budget = (SELECT * FROM min_cost)) AS least_expensive
       FROM Films
      )
      WHERE most_profitable OR least_expensive;
    
    Login or Signup to reply.
  2. As a started: you don’t need these multiple subqueries on the films table to identify the most profitable and least expensive items! You can just use window functions. Here is a query that ranks all films by (decreasing) profit and (increasing) budget:

    select f.title, f.year,
        row_number() (order by gross - budget desc) rn_profit,
        row_number() (order by budget) rn_budget
    from films f
    

    We can then turn it to a subquery and filter on the film(s) we want:

    select *
    from (
        select f.title, f.year,
            row_number() (order by gross - budget desc) rn_profit,
            row_number() (order by budget) rn_budget
        from films f
    ) f
    where 1 in (rn_profit, rn_budget)
    

    This gives you two rows (or just one if a single film tops boths categories); the row where column rn_profit has value 1 indicates the most profitable movie, while a value of 1 in rn_budget indicates the less expensive. You can interpret this information in a case expression in the select clause as you wish.

    Login or Signup to reply.
  3. Another option is to use a limiting clause after sorting by the profit or bugdet:

    (
      select *, 'most profitable'
      from films
      order by gross - budget desc
      fetch first 1 rows with ties
    )
    union all
    (
      select *, 'least expensive'
      from films
      order by budget
      fetch first 1 rows with ties
    )
    

    Normally it’s not necessary to enclose the queries of a union with parentheses, but in this case it’s required because we have to apply a ORDER BY and limit on each one individually.
    union all

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search