Why PostgreSQL doesn't use an index for JOIN

b0bi
January 27, 2024
163 views
0 votes
2 Answers

I have two tables "Tasks"(300k rows) and "TaskAssignedRoles"(500k rows).

CREATE TABLE public."Tasks" (
"ID" uuid NOT NULL,
"RowID" uuid NOT NULL,
"Planned" timestamptz NOT NULL,
CONSTRAINT "pk_Tasks" PRIMARY KEY ("RowID")
);
CREATE INDEX "idx_Tasks_ID" ON public."Tasks" USING btree ("ID");

CREATE TABLE public."TaskAssignedRoles" (
"ID" uuid NOT NULL,
"RowID" uuid NOT NULL,
"TaskRoleID" uuid NOT NULL,
"RoleID" uuid NOT NULL,
"RoleName" text NOT NULL,
"RoleTypeID" uuid NOT NULL,
"ParentRowID" uuid NULL,
CONSTRAINT "pk_TaskAssignedRoles" PRIMARY KEY ("RowID")
);
CREATE INDEX "idx_TaskAssignedRoles_ID" 
    ON public."TaskAssignedRoles" USING btree ("ID");
CREATE INDEX "ndx_TaskAssignedRoles_TaskRoleID" 
    ON public."TaskAssignedRoles" USING btree ("TaskRoleID") 
        INCLUDE ("ID", "RoleID", "RoleName", "RoleTypeID")
        WHERE (("ParentRowID" IS NULL) AND ("TaskRoleID" = 'f726ab6c-a279-4d79-863a-47253e55ccc1'::uuid));

I’m running the query:

explain (analyze)
SELECT t.*, tarp."RoleID", tarp."RoleName"
FROM "Tasks" AS "t"
INNER JOIN "TaskAssignedRoles" AS tarp 
    ON "tarp"."ID" = "t"."RowID" 
        AND "tarp"."TaskRoleID" = 'f726ab6c-a279-4d79-863a-47253e55ccc1' 
        AND "tarp"."ParentRowID" IS NULL

I get the plan:

Merge Join (cost=73.06..41145.44 rows=229928 width=434) (actual time=0.024..945.202 rows=234668 loops=1)
   Merge Cond: (t."RowID" = tarp."ID")
   -> Index Scan using "pk_Tasks" on "Tasks" t (cost=0.42..13861.52 rows=234668 width=388) (actual time=0.007..227.215 rows=234668 loops=1)
   -> Index Scan using "idx_TaskAssignedRoles_ID" on "TaskAssignedRoles" tarp (cost=0.42..23823.15 rows=229928 width=62) (actual time=0.013..541.680 rows=234668 loops=1)
         Filter: (("ParentRowID" IS NULL) AND ("TaskRoleID" = 'f726ab6c-a279-4d79-863a-47253e55ccc1'::uuid))
         Rows Removed by Filter: 280198
Planning Time: 0.353 ms
Execution Time: 953.534 ms

Tell me why Postgres does not use the ndx_TaskAssignedRoles_TaskRoleID index, although it fully matches the condition in the query?

Answers

- TimBiegeleisen
- January 27, 2024 at 4:08 pm
- 0 votes
0
I believe the problem is that while you think the index covers the query, it actually does not. Here are the columns from the TaskAssignedRoles table involved in the join condition:
```
ID
TaskRoleID
ParentRowID
```
Your index only includes the TaskRoleID column. The thing to realize here is that the INCLUDE clause only includes those column values at the leaf nodes of the B-tree; the columns are not otherwise structurally part of the index. Similarly, the WHERE clause just restricts the records in the table to some subset before creating the index. But again, the columns in the WHERE clause are not part of the index.

Try using this version:
```
CREATE INDEX "ndx_TaskAssignedRoles_TaskRoleID" 
ON public."TaskAssignedRoles" USING btree ("TaskRoleID", "RoleID", "ID")
    INCLUDE ("RoleName")
    WHERE "ParentRowID" IS NULL
```
This index would seem to completely cover the join condition. The INCLUDE clause includes one column needed by the select.
Login or Signup to reply.

- jjanes
- January 27, 2024 at 4:55 pm
- 0 votes
0
In my hands it is willing to use the index, against empty tables.

It probably doesn’t do that for you because it thinks it will be slower. To do a nested loop driven by your desired index, it thinks it would need to hit the other table and index 229,928 times. That is not cheap. It would rather do a bulk-friendly operation instead.

Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.