skip to Main Content

So I’m trying to union two queries and then order by a column. However, whenever I try and run the query it gives an error that doesn’t make sense to me.

Example data

CREATE TABLE test (
    Company_code    VARCHAR(120) NOT NULL,
    operating_year  INT          NOT NULL,
    Profit          INT          NOT NULL
);

INSERT INTO test (Company_code, operating_year, Profit)
VALUES  ('A', 1999, 2000),
        ('A', 2000, 3000),
        ('B', 1999, 1600),
        ('B', 2000, 4000);

Query

SELECT 
    t.company_code,
    t.profit
FROM
    test t
WHERE
    t.company_code = 'A'

UNION

SELECT
    t.company_code,
    t.profit
FROM
    test t
WHERE
    t.company_code = 'B'

ORDER BY
--    t.profit; --- Does *not* work
--    profit;   --- Does work

Ignore the very basic example, and how just adding an OR to the WHERE statement resolves this.

My question is why does having an alias in the ORDER BY throw an error when a UNION is involved. But not when run individually?

2

Answers


  1. Because you’re not sorting the separate queries that reference columns through a table alias. You are ordering the UNION query output which is almost like a table itself.

    The ORDER BY is performed after the UNION this is equivalent:

    SELECT * FROM (
        SELECT 
        t.company_code,
        t.profit
        FROM test t
        WHERE t.company_code = 'A'
        UNION
        SELECT
        t.company_code,
        t.profit
        FROM test t
        WHERE t.company_code = 'B'
    ) t
    ORDER BY t.profit; --- Should work
    --ORDER BY profit; --- Does work
    
    Login or Signup to reply.
  2. My question is why does having an alias (t) in the ORDER BY throw an error when a UNION is involved. But not when run individually.

    When you use a set-theoretic operator between two queries (e.g. UNION, UNION ALL, EXCEPT, INTERSECT, etc) the ORDER BY clause now applies to the end result of that operator, which is a new anonymous derived-table. So there’s no way to bind t in ORDER BY because it is now out-of-scope.

    If you add parentheses around the derived-tables it’s easier to see why… so your query is really like this pseudo-SQL:

    SELECT
        company_code,
        profit
    FROM
        (
            SELECT 
                t.company_code,
                t.profit
            FROM
                test t
            WHERE
                t.company_code = 'A'
        ) AS q1
    
        UNION
    
        (
            SELECT 
                t.company_code,
                t.profit
            FROM
                test t
            WHERE
                t.company_code = 'B'
        ) AS q2
    ORDER BY
        t.profit <-- `t` is not in scope. It's masked by (or "buried inside"?) q1 and q2.
        profit   <-- this works because `profit` is in the result of the UNION
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search