In MySQL version 8 series, there is SQL that uses aggregation functions such as count. LEFT JOIN is performed, and the pattern with "IS NULL" specified in the WHERE clause behaves differently than the pattern with "IS NULL" specified in the ON clause. Why?
-- pattern1
SELECT count(a.value)
FROM A AS a
LEFT JOIN B AS b
ON a.id = b.id
WHERE b.value IS NULL
;
-- pattern2
SELECT count(a.value)
FROM A AS a
LEFT JOIN B AS b
ON a.id = b.id
AND b.value IS NULL
;
-- pattern1's result = 3549
-- pattern2's result = 4000
A table has 4000 records.
4
Answers
The first one applied the
b.value IS NULL
condition durin the join, whereas the second one applies theb.value IS NULL
to final result set.See dbfiddle.
When you use
WHERE b.value IS NULL
, the output will only have any records havingb.value = NULL
. I.e. no matching records are found in B OR matched B record hasvalue = NULL
. Some records from table A might not end up in the result that are matching and have b.value not null.While, if you use
AND b.value IS NULL
, that means, from the B table, ONLY THOSE records will be selected as RIGHT table output where the value is null. But from A table, ALL the records will result in the output – including those having b.value NOT NULL.So, Pattern 2 may give MORE results in output.
When the condition by one table is placed in WHERE then it is applied after the joining.
When the condition by one table is placed in ON then it is applied during (but this looks like before) the joining.
First SELECT joins the tables without additional condition.
Second SELECT contains additional condition by right table in WHERE. This condition is applied after joining, i.e. to the rowset which is returned by first SELECT, and only one row is returned.
Third SELECT contains additional condition by right table in ON, which is applied to the source table. This is shown additionally.
Forth SELECT contains additional condition by left table in WHERE. This condition is applied after joining, i.e. to the rowset which is returned by first SELECT, and only two rows is returned.
Fifth SELECT contains additional condition by left table in ON, which is applied to the source table. This is shown additionally.
fiddle
With it in the on clause, you get at least one result row per a row, with b rows joins whenever b.value is null
With it in the where clause, you skip any a rows where one or more b rows exist and the b.values are all not null.