I have a table:
create table accounts_service.operation_history
(
history_id bigint generated always as identity
primary key,
operation_id varchar(36) not null
unique,
operation_type varchar(30) not null,
operation_time timestamp with time zone default now() not null,
from_phone varchar(20),
user_id varchar(21),
-- plus a lot of other varchar(x) and text columns, and even a couple of number, boolean, jsonb, and timestamp columns
);
create index operation_history_user_id_operation_time_idx
on accounts_service.operation_history (user_id, operation_time);
create index operation_history_operation_time_idx
on accounts_service.operation_history (operation_time);
I want to run a simple SELECT with a WHERE filter on operation_time (this filter is required and typically covers a day or two), plus additional filters on other columns, most commonly of varchar(x) type.
But my queries are slow:
explain (buffers, analyze)
select *
from operation_history operationh0_
where (null is null or operationh0_.user_id = null)
and operationh0_.operation_time >= '2024-09-30 20:00:00.000000 +00:00'
and operationh0_.operation_time <= '2024-10-02 20:00:00.000000 +00:00'
and (operationh0_.from_phone = '+000111223344')
order by operationh0_.operation_time asc, operationh0_.history_id asc
limit 25;
Limit (cost=8063.39..178328.00 rows=25 width=1267) (actual time=174373.106..174374.395 rows=0 loops=1)
Buffers: shared hit=532597 read=1433916
I/O Timings: read=517880.241
-> Incremental Sort (cost=8063.39..198759.76 rows=28 width=1267) (actual time=174373.105..174374.394 rows=0 loops=1)
Sort Key: operation_time, history_id
Presorted Key: operation_time
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
Buffers: shared hit=532597 read=1433916
I/O Timings: read=517880.241
-> Gather Merge (cost=1000.60..198758.50 rows=28 width=1267) (actual time=174373.099..174374.388 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=532597 read=1433916
I/O Timings: read=517880.241
-> Parallel Index Scan using operation_history_operation_time_idx on operation_history operationh0_ (cost=0.57..197755.24 rows=12 width=1267) (actual time=174362.932..174362.933 rows=0 loops=3)
Index Cond: ((operation_time >= '2024-09-30 20:00:00+00'::timestamp with time zone) AND (operation_time <= '2024-10-02 20:00:00+00'::timestamp with time zone))
Filter: ((from_phone)::text = '+000111223344'::text)
Rows Removed by Filter: 723711
Buffers: shared hit=532597 read=1433916
I/O Timings: read=517880.241
Planning Time: 0.193 ms
Execution Time: 174374.449 ms
For simplicity:
set max_parallel_workers_per_gather = 0;
This just simplifies the plan; the numbers remain relevant. Retrying the previous query:
Limit (cost=7535.40..189179.35 rows=25 width=1267) (actual time=261432.728..261432.729 rows=0 loops=1)
Buffers: shared hit=374346 read=1591362
I/O Timings: read=257253.065
-> Incremental Sort (cost=7535.40..210976.63 rows=28 width=1267) (actual time=261432.727..261432.727 rows=0 loops=1)
Sort Key: operation_time, history_id
Presorted Key: operation_time
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
Buffers: shared hit=374346 read=1591362
I/O Timings: read=257253.065
-> Index Scan using operation_history_operation_time_idx on operation_history operationh0_ (cost=0.57..210975.37 rows=28 width=1267) (actual time=261432.720..261432.720 rows=0 loops=1)
Index Cond: ((operation_time >= '2024-09-30 20:00:00+00'::timestamp with time zone) AND (operation_time <= '2024-10-02 20:00:00+00'::timestamp with time zone))
Filter: ((from_phone)::text = '+000111223344'::text)
Rows Removed by Filter: 2171134
Buffers: shared hit=374346 read=1591362
I/O Timings: read=257253.065
Planning Time: 0.170 ms
Execution Time: 261432.774 ms
So it filtered just 2,171,134 rows, and that took more than 4 minutes. That seems too long, doesn’t it?
- I tried selecting specific columns (e.g. operation_time, from_phone, to_phone, history_id); it had no effect.
- I tried vacuum analyze; it had no effect.
- I checked some Postgres parameters, such as shared_buffers, work_mem, etc.; changing them had no effect.
- I compared the configuration with pgTune and it’s OK.
Some additional info:
SELECT relpages, pg_size_pretty(pg_total_relation_size(oid)) AS table_size
FROM pg_class
WHERE relname = 'operation_history';
18402644 | 210 GB |
select count(*) from operation_history;
352402877 |
Server drives: AWS gp3
I don’t want to create indexes for all columns because there are massive writes to this table…
Is there any way to optimize it?
Or is it simply doing a lot of reads from the index and the table, so this is expected and we need to move to sharding, etc.?
UPD:
I checked index bloat with this query:
idxname | real_size | extra_size | extra_pct | fillfactor | bloat_size | bloat_pct | is_na |
---|---|---|---|---|---|---|---|
operation_history_operation_time_idx | 7839301632 | 746373120 | 9.520913405772113 | 90 | 0 | -0.6147682314362566 | false |
I checked table bloat with this query:
tblname | real_size | extra_size | extra_pct | fillfactor | bloat_size | bloat_pct | is_na |
---|---|---|---|---|---|---|---|
operation_history | 150754459648 | 7987224576 | 5.298168024116535 | 100 | 7987224576 | 5.298168024116535 | false |
And it seems to be OK.
2 Answers
The I/O performance you get from AWS is abysmally bad. Even an un-optimized query with poor index support like the one at hand should never take minutes.
For the query at hand: it has to filter all 2,171,134 rows that qualify after the range condition on operation_time. What it really needs is an index on (from_phone), ideally in its combined form (from_phone, operation_time), fields in this order. That would be a Get Out of Jail Free card for this query (see the sketch below). But you mentioned variations: the more fields that can be filtered on, the harder it gets to cover everything with indexes.
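A sketch of that combined index (the index name is an illustrative choice of mine):
create index operation_history_from_phone_operation_time_idx
    on accounts_service.operation_history (from_phone, operation_time);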
Upgrading from the aging Postgres version 13 (EOL next year) should also help.
And more RAM, faster I/O, obviously.
And/or more CPUs, coupled with a higher setting for max_parallel_workers_per_gather.
If the total relation size of 210 GB is much bigger than expected, tables and indexes might be bloated. (Investigate to verify!) The brute-force, built-in tool to fix that – with a long exclusive lock on the table – is sketched below.
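A sketch, assuming the built-in tool meant here is VACUUM FULL (it rewrites the whole table and all its indexes while holding an ACCESS EXCLUSIVE lock):
vacuum full analyze accounts_service.operation_history;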
There are smarter alternatives. But after a rough calculation, the table is probably not bloated anyway: what is shown here is already good for at least 110 GB in pristine condition, and you mentioned more columns. So that’s not it.
Asides
It rarely makes sense to include both the lower and the upper bound in such a scenario. It should probably be a half-open range, as sketched below.
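Presumably along these lines, reusing the literals from the query above:
and operationh0_.operation_time >= '2024-09-30 20:00:00+00'
and operationh0_.operation_time <  '2024-10-02 20:00:00+00'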
operation_id varchar(36)? I smell a UUID in ugly disguise.

It’s possible you can create a multicolumn BTREE ("indexed sequential") index that will support your query more efficiently. Try this one.
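A sketch of what such an index might look like, reconstructed from the points below (the index name and the exact column order are my assumption):
create index operation_history_phone_user_time_idx
    on accounts_service.operation_history (from_phone, user_id, operation_time, history_id);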
This index may be astonishingly fast because:
- The from_phone = 'whatever' filter is probably very selective, ruling out almost all the rows of the table.
- The user_id IS NULL filter isn’t as selective, but it is still a kind of equality filter.
- Your range filter is on operation_time, and it is also one of your ordering columns.
- Your other ordering column is history_id.
If my guess about this index is correct, your filter will be satisfied by random-accessing the index to the first eligible row, then scanning it sequentially for your limit of 25 rows (or until the first ineligible row).
And if you add an INCLUDE clause for any selected columns that are not already index keys (see the sketch below), it will be a so-called covering index. That is, the query planner can satisfy your query from the index alone, without needing to refer to the table’s heap. That should make it very fast indeed.
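A sketch of that covering variant, assuming the four selected columns mentioned in the question (operation_time, from_phone, to_phone, history_id); the index name is illustrative:
create index operation_history_phone_user_time_cov_idx
    on accounts_service.operation_history (from_phone, user_id, operation_time, history_id)
    include (to_phone);
Note that with SELECT * the index can never be covering, which ties into the pro tip below.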
Pro tip: when you have query performance problems, it’s very wise to avoid SELECT * and instead give the names of the columns you need. Leaving out unneeded columns may allow the query planner to avoid expensive data slinging.
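For example, a trimmed-down variant of the query above that names only the columns mentioned in the question (and drops the redundant null is null clause):
select operationh0_.operation_time,
       operationh0_.from_phone,
       operationh0_.to_phone,
       operationh0_.history_id
from operation_history operationh0_
where operationh0_.operation_time >= '2024-09-30 20:00:00.000000 +00:00'
  and operationh0_.operation_time <= '2024-10-02 20:00:00.000000 +00:00'
  and operationh0_.from_phone = '+000111223344'
order by operationh0_.operation_time asc, operationh0_.history_id asc
limit 25;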