We have a table foo_tbl
(name obsfucated, same data type and DDL):
CREATE TABLE public.foo_tbl (
id int8 NOT NULL,
foo_id varchar(11) NOT NULL,
foo_date timestamptz NULL,
-- ... other unrelated columns ...
CONSTRAINT pk_footbl PRIMARY KEY (id)
);
CREATE INDEX idx_1_2cols ON public.foo_tbl USING btree (foo_date, foo_id); -- initial index
CREATE INDEX idx_2_1col ON public.foo_tbl USING btree (foo_id); -- added later, when the query is slow
We have a big query that join 7 tables with this table using foo_id
and getting foo_date
out.
Example (real query is much bigger):
select b.bar_code, f.foo_date from bar_tbl b join foo_tbl f on b.bar_id = f.foo_id limit 100;
Without the join with foo_tbl
the query is fast (< 2s).
After add the join with foo_tbl
, the query is much slower (> 15s), despite running an "Index Only Scan" on foo_tbl
using index idx_1_2cols
(only these 2 columns of this table are used in the query). This is the EXPLAIN ANALYZE
result for the table:
{
"Node Type": "Index Only Scan",
"Parent Relationship": "Inner",
"Parallel Aware": false,
"Scan Direction": "Forward",
"Index Name": "idx_1_2cols",
"Relation Name": "foo_tbl",
"Schema": "public",
"Alias": "f",
"Startup Cost": 0.42,
"Total Cost": 2886.11,
"Plan Rows": 1,
"Plan Width": 20,
"Actual Startup Time": 12.843,
"Actual Total Time": 13.068,
"Actual Rows": 1,
"Actual Loops": 1200,
"Output": ["f.foo_date", "f.foo_id"],
"Index Cond": "(f.foo_id = (b.bar_id)::text)",
"Rows Removed by Index Recheck": 0,
"Heap Fetches": 0,
"Shared Hit Blocks": 2284772,
"Shared Read Blocks": 0,
"Shared Dirtied Blocks": 0,
"Shared Written Blocks": 0,
"Local Hit Blocks": 0,
"Local Read Blocks": 0,
"Local Dirtied Blocks": 0,
"Local Written Blocks": 0,
"Temp Read Blocks": 0,
"Temp Written Blocks": 0,
"I/O Read Time": 0.0,
"I/O Write Time": 0.0
}
To investigate, we created the single column index idx_2_1col
and the query is fast again (< 3s). When EXPLAIN, the planner choose the new index over the old index for an "Index scan":
{
"Node Type": "Index Scan",
"Parent Relationship": "Inner",
"Parallel Aware": false,
"Scan Direction": "Forward",
"Index Name": "idx_2_1col",
"Relation Name": "foo_tbl",
"Schema": "public",
"Alias": "f",
"Startup Cost": 0.42,
"Total Cost": 0.46,
"Plan Rows": 1,
"Plan Width": 20,
"Actual Startup Time": 0.007,
"Actual Total Time": 0.007,
"Actual Rows": 1,
"Actual Loops": 1200,
"Output": ["f.foo_date", "f.foo_id"],
"Index Cond": "((f.foo_id)::text = (b.bar_id)::text)",
"Rows Removed by Index Recheck": 0,
"Shared Hit Blocks": 4800,
"Shared Read Blocks": 0,
"Shared Dirtied Blocks": 0,
"Shared Written Blocks": 0,
"Local Hit Blocks": 0,
"Local Read Blocks": 0,
"Local Dirtied Blocks": 0,
"Local Written Blocks": 0,
"Temp Read Blocks": 0,
"Temp Written Blocks": 0,
"I/O Read Time": 0.0,
"I/O Write Time": 0.0
}
So, why is the Index scan in this case faster than the Index only scan? And why is the Index only scan so slow?
Notes:
- Already
VACUUM ANALYZE
beforeEXPLAIN ANALYZE
the query - The
foo_tbl
is not the biggest, just some hundred thousands records, some tables in the join contains millions of records. - DBS is Amazon Aurora PostgreSQL-Compatible 13.5 (not serverless)
2
Answers
Since the DDL for
bar_tbl
hasn’t been provided, the following is just an educated guess.The query planner apparently chose to use
idx_1_2cols
because it has both columns that are needed and the planner estimated that using the index would be more efficient than scanning the base table (it most definitely should be since reading the table would require many more block reads due to the unneeded columns).The problem is that the join is on
foo_id
, but the index prefix isfoo_date
. Change the index order to (foo_id, foo_date
) and the query will run much more quickly.Adding
idx_2_1col
improved performance because the join betweenbar_id
andfoo_id
could be made efficiently using the index, even though reads from the base table were also needed to satisfy the query.The left-most column in a multi column index is the one that should be queried. In your case, the
foo_date
is only returned and the value check is done solely on the second columnfoo_id
.The doc is clear about this and even state that in such situation the entire index is to be scanned and the planner is likely to scan the entire table instead.
You can try switching the columns in the index, or to create a covering index by including the date in the 2nd index to avoid touching the table.