I have simple query
explain SELECT * FROM markets LEFT JOIN outcomes ON markets.uuid = outcomes.market_uuid WHERE markets.event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0';
and explain show me
Merge Right Join (cost=63274.17..905696.60 rows=78271 width=8201)
Merge Cond: (outcomes.market_uuid = markets.uuid)
-> Index Scan using outcomes_market_uuid_index on outcomes (cost=0.43..802379.27 rows=15654269 width=2776)
-> Sort (cost=63273.73..63336.34 rows=25043 width=5425)
Sort Key: markets.uuid
-> Bitmap Heap Scan on markets (cost=220.92..27250.08 rows=25043 width=5425)
Recheck Cond: (event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0'::uuid)
-> Bitmap Index Scan on markets_event_uuid_index (cost=0.00..214.66 rows=25043 width=0)
Index Cond: (event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0'::uuid)
JIT:
Functions: 9
Options: Inlining true, Optimization true, Expressions true, Deforming true
EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM markets
LEFT JOIN outcomes ON markets.uuid = outcomes.market_uuid
WHERE markets.event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0';
Merge Right Join (cost=63274.17..905696.60 rows=78271 width=8201) (actual time=21215.656..21215.665 rows=2 loops=1)
Merge Cond: (outcomes.market_uuid = markets.uuid)
Buffers: shared hit=7635492 read=600026 written=2031
-> Index Scan using outcomes_market_uuid_index on outcomes (cost=0.43..802379.27 rows=15654269 width=2776) (actual time=0.111..19456.586 rows=8421452 loops=1)
Buffers: shared hit=7635488 read=600026 written=2031
-> Sort (cost=63273.73..63336.34 rows=25043 width=5425) (actual time=0.047..0.049 rows=1 loops=1)
Sort Key: markets.uuid
Sort Method: quicksort Memory: 26kB
Buffers: shared hit=4
-> Bitmap Heap Scan on markets (cost=220.92..27250.08 rows=25043 width=5425) (actual time=0.025..0.027 rows=1 loops=1)
Recheck Cond: (event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0'::uuid)
Heap Blocks: exact=1
Buffers: shared hit=4
-> Bitmap Index Scan on markets_event_uuid_index (cost=0.00..214.66 rows=25043 width=0) (actual time=0.018..0.018 rows=1 loops=1)
Index Cond: (event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0'::uuid)
Buffers: shared hit=3
Planning Time: 0.416 ms
JIT:
Functions: 9
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 2.989 ms, Inlining 4.691 ms, Optimization 200.586 ms, Emission 117.924 ms, Total 326.192 ms
Execution Time: 21218.828 ms
EXPLAIN ANALYZE SELECT * FROM markets
LEFT JOIN outcomes ON markets.uuid = outcomes.market_uuid
WHERE markets.event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0';
Hash Right Join (cost=98.69..681948.87 rows=269 width=4521) (actual time=31871.835..31871.846 rows=2 loops=1)
Hash Cond: (outcomes.market_uuid = markets.uuid)
-> Seq Scan on outcomes (cost=0.00..640757.69 rows=15654269 width=2776) (actual time=0.034..29402.201 rows=15654279 loops=1)
-> Hash (cost=97.62..97.62 rows=86 width=1745) (actual time=364.287..364.290 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Index Scan using markets_event_uuid_index on markets (cost=0.43..97.62 rows=86 width=1745) (actual time=0.027..0.032 rows=1 loops=1)
Index Cond: (event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0'::uuid)
Planning Time: 0.326 ms
JIT:
Functions: 12
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 3.773 ms, Inlining 3.798 ms, Optimization 226.614 ms, Emission 133.456 ms, Total 367.641 ms
Execution Time: 31875.826 ms
Some information:
- The database has been restored from a backup
- Postgres has been restarted on the server
- Another query with Left Join working perfect
- Just this query above, join + where causes the problem
- Query time 20-30 s, on right join table with 15 million records
- Table has primary_key and index
- Version Postgres 15.2
- On local base (Version Postgres 14.3) works perfect
Query mixed where/join with time 1-5 ms (works perfect)
EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM markets
LEFT JOIN events ON markets.event_uuid = events.uuid
WHERE markets.event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0';
Buffers: shared hit=8
-> Bitmap Heap Scan on markets (cost=220.92..27250.08 rows=25043 width=5425) (actual time=0.030..0.031 rows=1 loops=1)
Recheck Cond: (event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0'::uuid)
Heap Blocks: exact=1
Buffers: shared hit=4
-> Bitmap Index Scan on markets_event_uuid_index (cost=0.00..214.66 rows=25043 width=0) (actual time=0.019..0.019 rows=1 loops=1)
Index Cond: (event_uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0'::uuid)
Buffers: shared hit=3
-> Materialize (cost=0.42..2.65 rows=1 width=5491) (actual time=0.045..0.045 rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using events_pkey on events (cost=0.42..2.64 rows=1 width=5491) (actual time=0.024..0.024 rows=1 loops=1)
Index Cond: (uuid = 'f286f78b-9e9c-4052-9785-7c908a8ab2c0'::uuid)
Buffers: shared hit=4
Planning Time: 0.384 ms
Execution Time: 0.253 ms
EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM markets
LEFT OUTER JOIN outcomes ON markets.uuid = outcomes.market_uuid
WHERE markets.uuid = '6dc00c42-9b80-43a5-a263-5010648a38fd';
Buffers: shared hit=3 read=6
-> Index Scan using markets_pkey on markets (cost=0.43..2.65 rows=1 width=5425) (actual time=3.183..3.185 rows=1 loops=1)
Index Cond: (uuid = '6dc00c42-9b80-43a5-a263-5010648a38fd'::uuid)
Buffers: shared hit=1 read=3
-> Bitmap Heap Scan on outcomes (cost=781.94..78619.53 rows=78271 width=2776) (actual time=1.713..1.723 rows=2 loops=1)
Recheck Cond: (market_uuid = '6dc00c42-9b80-43a5-a263-5010648a38fd'::uuid)
Heap Blocks: exact=2
Buffers: shared hit=2 read=3
-> Bitmap Index Scan on outcomes_market_uuid_index (cost=0.00..762.37 rows=78271 width=0) (actual time=1.682..1.682 rows=2 loops=1)
Index Cond: (market_uuid = '6dc00c42-9b80-43a5-a263-5010648a38fd'::uuid)
Buffers: shared hit=1 read=2
Planning Time: 0.360 ms
Execution Time: 4.975 ms
Why engine Postgres change my query? Cache? Bug? Can you help me?
2
Answers
Downgrade postgres to 14.8 fix the problem. Thank you for your help. Thanks to you I understood a bit better how the database works and I learned new ways to analyze queries.
SQL is a query language not a programming language. The difference is that in programming language you write the exact code that the computer must execute, and in a query langage yous just specify what you want to have in the result set, not how !
So the optimizer, like Madame Mary Watson (Sherlock Holms landlady) must accomplish the task of answering your request in the smartest way, and that’s why he rewrote your request, which is a mathematical equivalent sctict in order to answer you faster…