I have two Aurora PG databases, one at version 12.8, the other at 13.4. I have a table that looks like this:
CREATE TABLE IF NOT EXISTS table1
(
id character varying COLLATE pg_catalog."C" NOT NULL,
col1 character varying COLLATE pg_catalog."C" NOT NULL,
col2 bytea,
CONSTRAINT id_pkey PRIMARY KEY (id)
);
CREATE UNIQUE INDEX IF NOT EXISTS idx_col2
ON table1 USING btree
(col2 ASC NULLS LAST)
WHERE col2 IS NOT NULL;
CREATE UNIQUE INDEX IF NOT EXISTS idx_col1
ON table1 USING btree
(col1 COLLATE pg_catalog."C" ASC NULLS LAST);
The PG12 table has about 8 million rows, while the PG13 table has only around 200,000. Nonetheless, queries on the PG13 table consistently hit the indexes, while queries on the PG12 table do not. Example results of EXPLAIN ANALYZE for a query filtered as follows:
WHERE
col2 = 'x3be8f76fd6199cbbcd4134bf505266841579817de7f3e59fe3947db6b5279fe2' OR
col1 = 'ORrKzFeI37dV-bnk1heGopi61koa9fmO'
LIMIT 1;
-- in PG12:
Limit (cost=0.00..8.26 rows=1 width=32) (actual time=1614.602..1614.603 rows=0 loops=1)
-> Seq Scan on table1 (cost=0.00..308297.01 rows=37344 width=32) (actual time=1614.601..1614.601 rows=0 loops=1)
Filter: ((col2 = 'x3be8f76fd6199cbbcd4134bf505266841579817de7f3e59fe3947db6b5279fe2'::bytea) OR ((col1)::text = 'ORrKzFeI37dV-bnk1heGopi61koa9fmO'::text))
Rows Removed by Filter: 7481857
Planning Time: 0.478 ms
Execution Time: 1614.623 ms
-- in PG13:
Limit (cost=8.58..12.60 rows=1 width=32) (actual time=0.022..0.022 rows=0 loops=1)
-> Bitmap Heap Scan on table1 (cost=8.58..12.60 rows=1 width=32) (actual time=0.021..0.021 rows=0 loops=1)
Recheck Cond: ((col2 = 'x3be8f76fd6199cbbcd4134bf505266841579817de7f3e59fe3947db6b5279fe2'::bytea) OR ((col1)::text = 'ORrKzFeI37dV-bnk1heGopi61koa9fmO'::text))
-> BitmapOr (cost=8.58..8.58 rows=1 width=0) (actual time=0.018..0.018 rows=0 loops=1)
-> Bitmap Index Scan on idx_authcol1_col2 (cost=0.00..4.15 rows=1 width=0) (actual time=0.009..0.009 rows=0 loops=1)
Index Cond: (col2 = 'x3be8f76fd6199cbbcd4134bf505266841579817de7f3e59fe3947db6b5279fe2'::bytea)
-> Bitmap Index Scan on ix_authcol1_col1 (cost=0.00..4.43 rows=1 width=0) (actual time=0.008..0.008 rows=0 loops=1)
Index Cond: ((col1)::text = 'ORrKzFeI37dV-bnk1heGopi61koa9fmO'::text)
Planning Time: 0.520 ms
Execution Time: 0.053 ms
I can’t reproduce these results locally or figure out why Postgres decides to do a sequential scan on the PG12 database, and I’m not sure whether it’s a quirk of Aurora or of the Postgres version we’re using.
Note that if I query the fields individually, i.e. without an OR, both databases hit the index for every query. It’s only when the OR is used that the PG12 database falls back to a sequential scan.
EDIT: Some additional information. This table incurs heavy reads without a lot of updates and virtually no deletes, which, based on this note in the PG 13 changelog, could explain out-of-date statistics and therefore inaccurate planning:
Allow inserts, not only updates and deletes, to trigger vacuuming activity in autovacuum (Laurenz Albe, Darafei Praliaskouski)
2 Answers
It’s a question of bad statistics. The first execution (PG12) estimates that almost 40,000 rows match the WHERE condition, while the second execution (PG13) knows that it is not more than 1. Collect statistics with ANALYZE table1; and figure out why automatic statistics collection did not suffice.
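As a rough sketch of that advice, the statements below refresh the statistics by hand, re-check the plan, and optionally make autoanalyze more eager for this one table. The select list (SELECT *) and the autovacuum values are assumptions for illustration, not settings taken from the question.

```sql
-- Refresh planner statistics for the table from the question.
ANALYZE VERBOSE table1;

-- Re-run the plan. With fresh statistics the row estimate may drop,
-- which lets the planner consider a BitmapOr over both indexes.
-- (SELECT * is an assumption; the original select list is not shown.)
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM table1
WHERE col2 = 'x3be8f76fd6199cbbcd4134bf505266841579817de7f3e59fe3947db6b5279fe2'
   OR col1 = 'ORrKzFeI37dV-bnk1heGopi61koa9fmO'
LIMIT 1;

-- Optional: make autoanalyze fire sooner on this table.
-- The values below are illustrative assumptions; tune them for the workload.
ALTER TABLE table1 SET (
    autovacuum_analyze_scale_factor = 0.01,
    autovacuum_analyze_threshold = 1000
);
```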
First, check that the table’s statistics are up to date, as mentioned by @LaurenzAlbe.
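One way to check this, sketched with the standard statistics views (the table and column names come from the question; nothing here is specific to Aurora):

```sql
-- When was the table last analyzed, and how much has changed since then?
SELECT relname,
       n_live_tup,
       n_mod_since_analyze,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'table1';

-- What the planner currently believes about the two filtered columns.
-- If this returns no rows, the columns have no collected statistics.
SELECT attname, null_frac, n_distinct
FROM pg_stats
WHERE tablename = 'table1'
  AND attname IN ('col1', 'col2');
```

If last_analyze and last_autoanalyze are both NULL, or n_mod_since_analyze is large relative to n_live_tup, the statistics are likely stale.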
Now, the OR operator is not doing your query any favors. The query may be easier to optimize when rephrased as:
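A minimal sketch of one such rephrasing, assuming the query selects all columns (the select list is not shown in the question): split the OR into two single-condition lookups, each of which can use its own index, and combine them with UNION ALL.

```sql
-- Each branch filters on one column only, so each can use its own index.
(
    SELECT *
    FROM table1
    WHERE col2 = 'x3be8f76fd6199cbbcd4134bf505266841579817de7f3e59fe3947db6b5279fe2'
    LIMIT 1
)
UNION ALL
(
    SELECT *
    FROM table1
    WHERE col1 = 'ORrKzFeI37dV-bnk1heGopi61koa9fmO'
    LIMIT 1
)
LIMIT 1;  -- keep at most one row overall, as in the original query
```

UNION ALL is enough here because only one row is kept. If the outer LIMIT is dropped, a row matching both conditions would appear twice, so UNION would be the safer choice in that case.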