Postgresql - Select count query very slow on cold cache

JordiChac243n
July 29, 2024
164 views
0 votes
2 Answers

Problem

The following query takes 42 seconds when most of the data in not cached:

EXPLAIN (ANALYZE, BUFFERS) select count(*) from packages where company_id = 178381;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=395914.63..395914.63 rows=1 width=8) (actual time=42411.940..42411.942 rows=1 loops=1)
   Buffers: shared hit=21775 read=94888
   I/O Timings: read=39723.315
   ->  Bitmap Heap Scan on packages  (cost=1053.07..395761.41 rows=306442 width=0) (actual time=83.104..42336.765 rows=322432 loops=1)
         Recheck Cond: (company_id = 178381)
         Heap Blocks: exact=116385
         Buffers: shared hit=21775 read=94888
         I/O Timings: read=39723.315
         ->  Bitmap Index Scan on packages_company_id_index  (cost=0.00..1037.75 rows=306442 width=0) (actual time=45.846..45.847 rows=325795 loops=1)
               Index Cond: (company_id = 178381)
               Buffers: shared hit=1 read=277
               I/O Timings: read=7.090
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.237 ms
 Execution Time: 42413.042 ms

Running the query again immediately after is naturally much faster:

distru_prod=> EXPLAIN (ANALYZE, BUFFERS) select count(*) from packages where company_id = 178381;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=395914.63..395914.63 rows=1 width=8) (actual time=416.943..416.957 rows=1 loops=1)
   Buffers: shared hit=116589
   ->  Bitmap Heap Scan on packages  (cost=1053.07..395761.41 rows=306442 width=0) (actual time=78.925..395.495 rows=322432 loops=1)
         Recheck Cond: (company_id = 178381)
         Heap Blocks: exact=116308
         Buffers: shared hit=116589
         ->  Bitmap Index Scan on packages_company_id_index  (cost=0.00..1037.75 rows=306442 width=0) (actual time=46.359..46.360 rows=325351 loops=1)
               Index Cond: (company_id = 178381)
               Buffers: shared hit=281
 Planning:
   Buffers: shared hit=448
 Planning Time: 1.375 ms
 Execution Time: 418.321 ms

More Info

PostgreSQL 14.12
Running on Google Cloud Platform (8 vCPUs, 50GB memory, SSD)
CPU usage was below 25% at the time of running the queries above

Here is some basic info about this table:

select count(distinct company_id) from packages;
 count
-------
   691

select count(*) from packages;
  count
----------
 10764441

select count(*) from packages where company_id = 178381;
 count
--------
 322432

select pg_size_pretty(pg_total_relation_size('packages'));
 pg_size_pretty
----------------
 12 GB

select pg_size_pretty(pg_total_relation_size('packages_company_id_index'));
 pg_size_pretty
----------------
 79 MB

Index being used:

CREATE INDEX packages_company_id_index ON public.packages USING btree (company_id);

Even when using pg_hint_plan to force an Index Only Scan on packages_company_id_index, buffers are still as high as when doing a Bitmap Heap Scan:

/*+ IndexOnlyScan(packages packages_company_id_index) */ EXPLAIN (ANALYZE, BUFFERS) select count(*) from packages where company_id = 178381;
                                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=517401.31..517401.32 rows=1 width=8) (actual time=172.448..172.450 rows=1 loops=1)
   Buffers: shared hit=116586
   ->  Index Only Scan using packages_company_id_index on packages  (cost=0.09..517248.09 rows=306442 width=0) (actual time=0.034..150.510 rows=322432 loops=1)
         Index Cond: (company_id = 178381)
         Heap Fetches: 325351
         Buffers: shared hit=116586
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.238 ms
 Execution Time: 172.546 ms

Questions

Why does the Index Only Scan still need as many buffers as the original query even though the index is only 79MB in disk?
Why is Postgres not doing an Index Only Scan unless instructed to? It would appear this query can be accomplished without going to the packages table at all?
The original query takes 42 seconds to count 322k rows within a 10M rows table, even when there is a perfectly matching index for it.
1. This seems way too high even accounting for the cold cache, but maybe this performance is expected?
2. If it’s not expected, what could I be doing wrong?
3. If it’s expected, are there any ways to improve the performance of the query above on a cold cache without prewarming the cache?

Tags: postgresql postgresql-performance

Answers

- LaurenzAlbe
- July 29, 2024 at 8:56 am
- 0 votes
0
1. The Index-only scan reads so many buffers because it is not really an index-only scan. VACUUM the table to refresh the visibility map, and you’ll see the difference.
2. Probably because the visibility map is not updated and perhaps because you left random_page_cost at a value that is too high.
3. According to the data you posted, the table should be at least 900 MB in size (116589 pages at 8 kB each).
  1. Would you expect your disk to read more than 19MB/s?
  2. Unanswerable, because it depends on an unknown "if".
  3. See 1. and 2. above.
Login or Signup to reply.

- jjanes
- July 29, 2024 at 3:34 pm
- 0 votes
0
Depending on your storage sub-system, increasing the value of effective_io_concurrency might substantially improve the performance of the cold-cache bitmap scan, by allowing multiple IO requests to be outstanding at a time.

Of course updating the visibility map so that the index-only scan can work properly and so getting rid of or greatly reducing "Heap Fetches: 325351" might make this pointless.

Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – Select count query very slow on cold cache

Answers