I have two tables "Tasks"(300k rows) and "TaskAssignedRoles"(500k rows).
CREATE TABLE public."Tasks" (
"ID" uuid NOT NULL,
"RowID" uuid NOT NULL,
"Planned" timestamptz NOT NULL,
CONSTRAINT "pk_Tasks" PRIMARY KEY ("RowID")
);
CREATE INDEX "idx_Tasks_ID" ON public."Tasks" USING btree ("ID");
CREATE TABLE public."TaskAssignedRoles" (
"ID" uuid NOT NULL,
"RowID" uuid NOT NULL,
"TaskRoleID" uuid NOT NULL,
"RoleID" uuid NOT NULL,
"RoleName" text NOT NULL,
"RoleTypeID" uuid NOT NULL,
"ParentRowID" uuid NULL,
CONSTRAINT "pk_TaskAssignedRoles" PRIMARY KEY ("RowID")
);
CREATE INDEX "idx_TaskAssignedRoles_ID"
ON public."TaskAssignedRoles" USING btree ("ID");
CREATE INDEX "ndx_TaskAssignedRoles_TaskRoleID"
ON public."TaskAssignedRoles" USING btree ("TaskRoleID")
INCLUDE ("ID", "RoleID", "RoleName", "RoleTypeID")
WHERE (("ParentRowID" IS NULL) AND ("TaskRoleID" = 'f726ab6c-a279-4d79-863a-47253e55ccc1'::uuid));
I’m running the query:
explain (analyze)
SELECT t.*, tarp."RoleID", tarp."RoleName"
FROM "Tasks" AS "t"
INNER JOIN "TaskAssignedRoles" AS tarp
ON "tarp"."ID" = "t"."RowID"
AND "tarp"."TaskRoleID" = 'f726ab6c-a279-4d79-863a-47253e55ccc1'
AND "tarp"."ParentRowID" IS NULL
I get the plan:
Merge Join (cost=73.06..41145.44 rows=229928 width=434) (actual time=0.024..945.202 rows=234668 loops=1)
Merge Cond: (t."RowID" = tarp."ID")
-> Index Scan using "pk_Tasks" on "Tasks" t (cost=0.42..13861.52 rows=234668 width=388) (actual time=0.007..227.215 rows=234668 loops=1)
-> Index Scan using "idx_TaskAssignedRoles_ID" on "TaskAssignedRoles" tarp (cost=0.42..23823.15 rows=229928 width=62) (actual time=0.013..541.680 rows=234668 loops=1)
Filter: (("ParentRowID" IS NULL) AND ("TaskRoleID" = 'f726ab6c-a279-4d79-863a-47253e55ccc1'::uuid))
Rows Removed by Filter: 280198
Planning Time: 0.353 ms
Execution Time: 953.534 ms
Tell me why Postgres does not use the ndx_TaskAssignedRoles_TaskRoleID index, although it fully matches the condition in the query?
2
Answers
I believe the problem is that while you think the index covers the query, it actually does not. Here are the columns from the
TaskAssignedRoles
table involved in the join condition:Your index only includes the
TaskRoleID
column. The thing to realize here is that theINCLUDE
clause only includes those column values at the leaf nodes of the B-tree; the columns are not otherwise structurally part of the index. Similarly, theWHERE
clause just restricts the records in the table to some subset before creating the index. But again, the columns in theWHERE
clause are not part of the index.Try using this version:
This index would seem to completely cover the join condition. The
INCLUDE
clause includes one column needed by the select.In my hands it is willing to use the index, against empty tables.
It probably doesn’t do that for you because it thinks it will be slower. To do a nested loop driven by your desired index, it thinks it would need to hit the other table and index 229,928 times. That is not cheap. It would rather do a bulk-friendly operation instead.