Postgresql - SQL JOIN: WHERE clause and ON clause

Andrei
January 25, 2024
228 views
0 votes
2 Answers

I’m trying to figure out how exactly ON and WHERE work when doing a JOIN

My understanding based on other questions is that I should filter in where when the condition is on the left side of the join, and in the on clause when filtering on the right side

select * from A left join B 
ON A.something=B.something AND {some_condition_on_B} 
WHERE {some_condition_on_A}

I have encountered an issue with a partitioned A table

this select is very slow :

select * from 
A 
left join 
B ON A.something=B.something 
WHERE {A.foo = bar}

while this makes the query faster:

select * from 
A 
left join B ON A.something=B.something AND {A.foo = bar}
WHERE {A.foo = bar}

My question is how come the second select is faster ? Is it something I’m missing, because it seems that it shouldn’t make a difference

Tags: postgresql sql

Answers

- Monofuse
- January 25, 2024 at 12:30 pm
- 0 votes
0
```
select * from 
A 
left join 
B ON A.something=B.something 
WHERE {A.foo = bar}
```
This joins the tables together and anything which matches A.something = B.something will join. Anything else will have a bunch of null values (more data to be transfered before the final filter)
```
select * from 
A 
left join B ON A.something=B.something AND {A.foo = bar}
WHERE {A.foo = bar}
```
This one joins the tables together, but will result in more null records if nothing was found which matches the condition {A.foo = bar} as well as A.Something = B.something. So less data is being transferred around before we get to the Where condition.

Both at the end are checking if they are equal (a null will never equals what it is being compared against, even another null)
Login or Signup to reply.

Don’t ask people to explain, ask your PostgreSQL to explain: demo

create table A(something text,foo text,some_condition_on_A boolean);

create table B(something text,some_condition_on_B boolean);

explain--why "this select is very slow"
select * from 
A 
left join 
B ON A.something=B.something 
WHERE A.foo = 'bar'

QUERY PLAN
Hash Right Join (cost=20.93..49.48 rows=27 width=98)
Hash Cond: (b.something = a.something)
-> Seq Scan on b (cost=0.00..23.30 rows=1330 width=33)
-> Hash (cost=20.88..20.88 rows=4 width=65)
-> Seq Scan on a (cost=0.00..20.88 rows=4 width=65)
Filter: (foo = ‘bar’::text)

explain--why "this makes the query faster"
select * from 
A 
left join B ON A.something=B.something AND A.foo = 'bar'
WHERE A.foo = 'bar'

QUERY PLAN
Hash Right Join (cost=20.93..49.55 rows=4 width=98)
Hash Cond: (b.something = a.something)
Join Filter: (a.foo = ‘bar’::text)
-> Seq Scan on b (cost=0.00..23.30 rows=1330 width=33)
-> Hash (cost=20.88..20.88 rows=4 width=65)
-> Seq Scan on a (cost=0.00..20.88 rows=4 width=65)
Filter: (foo = ‘bar’::text)

Query plans heavily depend on the profile of your dataset, your schema and server config, traffic on the table, all of which can change in time and between versions. Anything anyone can tell you about why these queries behave the way they do on your setup based on just the statement will be a pretty useless blind guess.

If you instead ask your PostgreSQL to explain analyze, it’ll tell you exactly what it did, how and why, based on a very detailed knowledge of everything it depends on.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – SQL JOIN: WHERE clause and ON clause

Answers