skip to Main Content

In MySQL version 8 series, there is SQL that uses aggregation functions such as count. LEFT JOIN is performed, and the pattern with "IS NULL" specified in the WHERE clause behaves differently than the pattern with "IS NULL" specified in the ON clause. Why?

-- pattern1
SELECT count(a.value)
FROM A AS a
LEFT JOIN B AS b
ON a.id = b.id
WHERE b.value IS NULL
;


-- pattern2
SELECT count(a.value)
FROM A AS a
LEFT JOIN B AS b
ON a.id = b.id
AND b.value IS NULL
;


-- pattern1's result = 3549
-- pattern2's result = 4000

A table has 4000 records.

4

Answers


  1. The first one applied the b.value IS NULL condition durin the join, whereas the second one applies the b.value IS NULL to final result set.

    See dbfiddle.

    Login or Signup to reply.
  2. When you use WHERE b.value IS NULL, the output will only have any records having b.value = NULL. I.e. no matching records are found in B OR matched B record has value = NULL. Some records from table A might not end up in the result that are matching and have b.value not null.

    While, if you use AND b.value IS NULL, that means, from the B table, ONLY THOSE records will be selected as RIGHT table output where the value is null. But from A table, ALL the records will result in the output – including those having b.value NOT NULL.

    So, Pattern 2 may give MORE results in output.

    Login or Signup to reply.
  3. When the condition by one table is placed in WHERE then it is applied after the joining.

    When the condition by one table is placed in ON then it is applied during (but this looks like before) the joining.


    CREATE TABLE t1 
    SELECT 1 id, 1 val UNION ALL
    SELECT 2, 2 UNION ALL
    SELECT 3, 3;
    
    CREATE TABLE t2 
    SELECT 1 id, 1 val UNION ALL
    SELECT 2, 2 UNION ALL
    SELECT 4, 4 UNION ALL
    SELECT NULL, 5;
    
    SELECT *
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.id;
    
    id val id val
    1 1 1 1
    2 2 2 2
    3 3 null null

    First SELECT joins the tables without additional condition.

    SELECT *
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.id
    WHERE t2.id IS NULL;
    
    id val id val
    3 3 null null

    Second SELECT contains additional condition by right table in WHERE. This condition is applied after joining, i.e. to the rowset which is returned by first SELECT, and only one row is returned.

    SELECT *
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.id
                AND t2.id IS NULL;
    
    SELECT * FROM t2 WHERE t2.id IS NULL;
    
    id val id val
    1 1 null null
    2 2 null null
    3 3 null null
    id val
    null 5

    Third SELECT contains additional condition by right table in ON, which is applied to the source table. This is shown additionally.

    SELECT *
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.id
    WHERE t1.id > 1;
    
    id val id val
    2 2 2 2
    3 3 null null

    Forth SELECT contains additional condition by left table in WHERE. This condition is applied after joining, i.e. to the rowset which is returned by first SELECT, and only two rows is returned.

    SELECT *
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.id
                AND t1.id > 1;
    
    SELECT * FROM t1 WHERE t1.id > 1;
    
    id val id val
    1 1 null null
    2 2 2 2
    3 3 null null
    id val
    2 2
    3 3

    Fifth SELECT contains additional condition by left table in ON, which is applied to the source table. This is shown additionally.

    fiddle

    Login or Signup to reply.
  4. With it in the on clause, you get at least one result row per a row, with b rows joins whenever b.value is null

    With it in the where clause, you skip any a rows where one or more b rows exist and the b.values are all not null.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search