
Postgres picks radically different query plans for parameter values that don't change the result.

-- This query is fast (~ 40 ms)
WITH filtered_scans AS (
  SELECT id FROM scans
  WHERE scans."createdAt" BETWEEN NOW() - INTERVAL '26 day' AND NOW() AND
    scans."applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'
)
SELECT "scanId", percentile_disc(0.5) WITHIN GROUP (ORDER BY duration) AS p50
FROM scan_exchanges
WHERE "scanId" IN (SELECT id FROM filtered_scans)
GROUP BY "scanId";

-- This one is slow (~ 2 s)
EXPLAIN ANALYZE WITH filtered_scans AS (
  SELECT id FROM scans
  WHERE scans."createdAt" BETWEEN NOW() - INTERVAL '27 day' AND NOW() AND
                                                 -- ⬆️ the only difference is here 
    scans."applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'
)
SELECT "scanId", percentile_disc(0.5) WITHIN GROUP (ORDER BY duration) AS p50
FROM scan_exchanges
WHERE "scanId" IN (SELECT id FROM filtered_scans)
GROUP BY "scanId";

In both cases, the filtered_scans CTE returns the same rows:

id
2a823963-d019-4066-aab5-fed94b5f05fd
4684e3c7-bbba-4fb4-909c-fc6d512f5555
abf56697-2400-452c-a68a-e243d53bafb0

And the resulting table is also the same:

scanId p50
2a823963-d019-4066-aab5-fed94b5f05fd 0.09799999999813735
4684e3c7-bbba-4fb4-909c-fc6d512f5555 0.07500000001164153
abf56697-2400-452c-a68a-e243d53bafb0 0.1159999999217689

The scan_exchanges table has an index on "scanId" named scan_exchanges_scanId_idx. Here are the (simplified) table definitions:

CREATE TABLE "scans" (
    "id" UUID NOT NULL,
    "createdAt" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    "applicationId" UUID NOT NULL,
    PRIMARY KEY ("id")
);
CREATE INDEX "scans_applicationId_idx" ON "scans" ("applicationId");

CREATE TABLE "scan_exchanges" (
    "id" UUID NOT NULL,
    "duration" DOUBLE PRECISION NOT NULL,
    "scanId" UUID NOT NULL,
    PRIMARY KEY ("id"),
    CONSTRAINT "scan_exchanges_scanId_fkey" FOREIGN KEY ("scanId") REFERENCES "scans" ("id") ON UPDATE CASCADE ON DELETE CASCADE
);
CREATE INDEX "scan_exchanges_scanId_idx" ON "scan_exchanges" ("scanId");
-- I removed some fields irrelevant to the question, tell me if I removed too much

I ran EXPLAIN ANALYZE on both queries, and Postgres 13.9 returned wildly different plans:

Fast query:

QUERY PLAN
GroupAggregate  (cost=26.03..308595.21 rows=818 width=24) (actual time=24.336..26.808 rows=3 loops=1)
  Group Key: scan_exchanges."scanId"
  ->  Nested Loop  (cost=26.03..308330.41 rows=50915 width=24) (actual time=23.299..26.374 rows=1414 loops=1)
        ->  Index Scan using scans_pkey on scans  (cost=0.28..1488.47 rows=71 width=16) (actual time=23.211..24.283 rows=3 loops=1)
              Filter: (("applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'::uuid) AND ("createdAt" <= now()) AND ("createdAt" >= (now() - '26 days'::interval)))
              Rows Removed by Filter: 1482
        ->  Bitmap Heap Scan on scan_exchanges  (cost=25.75..4308.74 rows=1298 width=24) (actual time=0.054..0.623 rows=471 loops=3)
              Recheck Cond: ("scanId" = scans.id)
              Heap Blocks: exact=492
              ->  Bitmap Index Scan on "scan_exchanges_scanId_idx"  (cost=0.00..25.43 rows=1298 width=0) (actual time=0.034..0.034 rows=509 loops=3)
                    Index Cond: ("scanId" = scans.id)
Planning Time: 0.393 ms
JIT:
  Functions: 14
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 2.845 ms, Inlining 0.000 ms, Optimization 1.138 ms, Emission 21.245 ms, Total 25.228 ms
Execution Time: 29.853 ms

Slow query:

QUERY PLAN
GroupAggregate  (cost=306783.39..307261.53 rows=818 width=24) (actual time=2028.021..2028.437 rows=3 loops=1)
  Group Key: scan_exchanges."scanId"
  ->  Sort  (cost=306783.39..306939.36 rows=62389 width=24) (actual time=2027.787..2027.933 rows=1414 loops=1)
        Sort Key: scan_exchanges."scanId"
        Sort Method: quicksort  Memory: 159kB
        ->  Hash Join  (cost=370.11..301814.42 rows=62389 width=24) (actual time=179.534..2027.085 rows=1414 loops=1)
              Hash Cond: (scan_exchanges."scanId" = scans.id)
              ->  Seq Scan on scan_exchanges  (cost=0.00..298648.46 rows=1062046 width=24) (actual time=25.341..1843.927 rows=1067166 loops=1)
              ->  Hash  (cost=369.02..369.02 rows=87 width=16) (actual time=0.538..0.540 rows=3 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 9kB
                    ->  Bitmap Heap Scan on scans  (cost=10.71..369.02 rows=87 width=16) (actual time=0.100..0.534 rows=3 loops=1)
                          Recheck Cond: ("applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'::uuid)
                          Filter: (("createdAt" <= now()) AND ("createdAt" >= (now() - '27 days'::interval)))
                          Rows Removed by Filter: 319
                          Heap Blocks: exact=156
                          ->  Bitmap Index Scan on "scans_applicationId_idx"  (cost=0.00..10.69 rows=322 width=0) (actual time=0.042..0.042 rows=322 loops=1)
                                Index Cond: ("applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'::uuid)
Planning Time: 0.689 ms
JIT:
  Functions: 19
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 6.831 ms, Inlining 0.000 ms, Optimization 1.793 ms, Emission 23.113 ms, Total 31.737 ms
Execution Time: 2035.574 ms

I'm not experienced enough to understand what made Postgres create two such different query plans.

What can I do to have Postgres use the fast query plan, regardless of the filter on scans."createdAt"?


Edit (29/03/2023):

I tried adding an index over ("applicationId", "createdAt", "id"), but unfortunately it does not make the plan any more predictable.
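
For reference, the new index was created roughly like this (a sketch; the name app_created_id is the one that appears in the plan below):

CREATE INDEX app_created_id ON scans ("applicationId", "createdAt", "id");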

Slow query:

QUERY PLAN
GroupAggregate  (cost=299278.22..305644.21 rows=818 width=24) (actual time=1122.330..1238.101 rows=3 loops=1)
  Group Key: scan_exchanges."scanId"
  ->  Gather Merge  (cost=299278.22..305366.87 rows=53423 width=24) (actual time=1121.841..1237.594 rows=1414 loops=1)
        Workers Planned: 1
        Workers Launched: 1
        ->  Sort  (cost=298278.21..298356.78 rows=31425 width=24) (actual time=1078.875..1078.947 rows=707 loops=2)
              Sort Key: scan_exchanges."scanId"
              Sort Method: quicksort  Memory: 101kB
              Worker 0:  Sort Method: quicksort  Memory: 59kB
              ->  Hash Join  (cost=10.91..295930.82 rows=31425 width=24) (actual time=132.823..1078.481 rows=707 loops=2)
                    Hash Cond: (scan_exchanges."scanId" = scans.id)
                    ->  Parallel Seq Scan on scan_exchanges  (cost=0.00..294275.33 rows=624733 width=24) (actual time=17.585..986.391 rows=535626 loops=2)
                    ->  Hash  (cost=9.97..9.97 rows=75 width=16) (actual time=0.093..0.095 rows=3 loops=2)
                          Buckets: 1024  Batches: 1  Memory Usage: 9kB
                          ->  Index Only Scan using app_created_id on scans  (cost=0.29..9.97 rows=75 width=16) (actual time=0.087..0.089 rows=3 loops=2)
                                Index Cond: (("applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482'::uuid) AND ("createdAt" >= (now() - '27 days'::interval)) AND ("createdAt" <= now()))
                                Heap Fetches: 0
Planning Time: 0.436 ms
JIT:
  Functions: 26
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 4.929 ms, Inlining 0.000 ms, Optimization 2.230 ms, Emission 32.063 ms, Total 39.222 ms
Execution Time: 1241.326 ms

(The plan for the fast query does not change and does not use the newly created index.)

2 Answers


  1. Chosen as BEST ANSWER

    In this specific case, it turns out setting random_page_cost closer to seq_page_cost makes the query planner more predictable.

    SHOW seq_page_cost;    -- 1.0
    SHOW random_page_cost; -- 4.0
    SET random_page_cost = 1.2;
    

    Thank you @LaurenzAlbe for the tip, and thanks to everyone else who helped.
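
    To keep the setting beyond the current session, something along these lines should work (a sketch; replace mydb with your actual database name):

    -- Persist the lower random_page_cost for one database:
    ALTER DATABASE mydb SET random_page_cost = 1.2;

    -- Or set it cluster-wide (requires superuser; picked up after a config reload):
    ALTER SYSTEM SET random_page_cost = 1.2;
    SELECT pg_reload_conf();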



  2. Asking "why?" questions about query planner choices probably doesn't help much in solving your production problem: what hurts isn't slow queries as such, it's queries that are anomalously slow only some of the time. That's the kind of unpredictability that makes your app seem flaky and erodes your users' trust.

    Query planners are complex beasts, hard to explain.

    VACUUMing or ANALYZEing your tables may help if they have seen a lot of changes since that was last done; the statistics the query planner relies on can get out of whack, especially on a busy table.
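
    For example, refreshing the statistics for the two tables involved is straightforward (a sketch, safe to re-run):

    VACUUM (ANALYZE) scans;
    VACUUM (ANALYZE) scan_exchanges;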

    Here’s a suggestion that may make this query’s plan more predictable. Create a compound covering BTREE index like this.

    CREATE INDEX CONCURRENTLY IF NOT EXISTS app_created_id
    ON scans USING BTREE ("applicationId", "createdAt", "id");
    

    Your CTE can be satisfied with a clean index range scan on that index. That should make simpler query plans possible.
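
    As a quick sanity check (a sketch, reusing the slow query's parameters), you can run the CTE's inner query on its own and confirm the planner picks up the new index:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT id FROM scans
    WHERE "createdAt" BETWEEN NOW() - INTERVAL '27 day' AND NOW()
      AND "applicationId" = '2ce67bbf-d740-4f4e-aaf8-33552d54e482';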
