In a PostgreSQL 14 database I have a transactions table currently containing around 400 million rows, and the following query on it:
SELECT "transactions"."id"
FROM "transactions"
WHERE ("wallet_id" = $1)
ORDER BY "transactions"."id" DESC
LIMIT 10
This works fine and the query is fast. The EXPLAIN ANALYZE output:
Limit  (cost=84991.29..84991.31 rows=10 width=146) (actual time=1.518..1.519 rows=2 loops=1)
  ->  Sort  (cost=84991.29..85056.88 rows=26235 width=146) (actual time=1.517..1.518 rows=2 loops=1)
        Sort Key: id DESC
        Sort Method: quicksort  Memory: 25kB
        ->  Index Scan using transactions_wallet_id_index on transactions  (cost=0.57..84424.36 rows=26235 width=146) (actual time=1.080..1.497 rows=2 loops=1)
              Index Cond: (wallet_id = $1)
Planning Time: 0.850 ms
Execution Time: 1.682 ms
If I add a second wallet ID to the WHERE clause, the query takes several minutes when both wallets have a small number of transactions (2 for the first one, 57 for the second one):
SELECT "transactions"."id"
FROM "transactions"
WHERE ("wallet_id" = $1 OR "wallet_id" = $2)
ORDER BY "transactions"."id" DESC
LIMIT 10
Its analysis:
Limit  (cost=0.57..117334.73 rows=10 width=146) (actual time=3683.524..944867.724 rows=10 loops=1)
  ->  Index Scan Backward using transactions_pkey on transactions  (cost=0.57..615664040.26 rows=52471 width=146) (actual time=3683.523..944867.715 rows=10 loops=1)
        Filter: ((wallet_id = $1) OR (wallet_id = $2))
        Rows Removed by Filter: 117937409
Planning Time: 0.810 ms
Execution Time: 944867.797 ms
After investigating this for hours, it seems the issue comes from using ORDER BY in combination with LIMIT, and indeed if I remove one of the two, the query runs fast. If at least one of the wallets has a non-small number of transactions, it also runs fast. id is the primary key and there's an index on wallet_id.
This is very disappointing and frustrating. It’s my first time using Postgres and the fact that the query planner does such a poor job on such a simple query is honestly hard to understand.
I’d appreciate some suggestions on how to make the query faster for all cases.
I tried different things for several hours now, including running VACUUM (just in case) and ANALYZE on the table, but to no avail.
3 Answers
After some more digging, I've finally solved the issue myself thanks to this blog post. It only applies if the ORDER BY column is an integer (or maybe just any number): the fix is adding + 0 to it. Posting it in case it helps someone.
I also found several issues reported to PostgreSQL about problems like, or related to, this one, and I find the disregard for some real-world usage issues like these from some of the core developers incredibly appalling. Apparently they consider it an acceptable design decision that a query that runs in 0.3s with a + 0 added to an ORDER BY on an integer field might take 20 minutes otherwise, due to some abort-early functionality in the planner gone wrong.
This has made me seriously reconsider our decision to migrate to Postgres, and we are already reviewing alternatives. Really sad.
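Applied to the query from the question, the workaround looks along these lines (just a sketch; the + 0 keeps the planner from walking the primary-key index to satisfy the ORDER BY):
SELECT "transactions"."id"
FROM "transactions"
WHERE ("wallet_id" = $1 OR "wallet_id" = $2)
ORDER BY "transactions"."id" + 0 DESC
LIMIT 10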
Why?
The additional filter makes Postgres switch to a different query plan that turns out to be extremely inefficient. The effect is triggered by chance, more or less. The underlying problem is that Postgres grossly under-estimates the selectivity of the WHERE clause. It expects that it can just walk the index on the PK (transactions_pkey) in the requested sort order and will find the few rows (LIMIT 10) soon enough. Turns out, the filter is extremely selective and Postgres has to skip over 118 million rows (!!!) before it finds enough matches (Rows Removed by Filter: 117937409). In cases where there are not enough rows to satisfy the limit, all rows have to be visited before Postgres can finally give up. The worst-case scenario.
The decision was made based on wrong or misleading column statistics. If you can improve column statistics, that might fix the problem by itself. It may be as simple as ANALYZE transactions;. There are various ways; here is a related answer from just yesterday on dba.SE with more on that (and links to more similar cases). Just so happens that I discussed the very same "brute force" hack with ORDER BY id + 0, which you found for your answer.
Solution
For your particular case, there is a different approach, too. To get the best performance possible, create a multicolumn index (once):
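Something along these lines (a sketch; the index name is illustrative):
-- one index covering both the filter column and the sort column
CREATE INDEX transactions_wallet_id_id_idx ON transactions (wallet_id, id);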
It increases performance for your simple query a lot, too (even if that's pretty fast already). You'll see an index-only scan (or at least an index scan) without a "Sort" step.
Then use this query:
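Presumably the per-wallet LIMIT plus UNION ALL pattern, roughly (a sketch, not necessarily the answer's exact query):
(  -- top 10 for the first wallet, straight off the (wallet_id, id) index
SELECT id FROM transactions
WHERE wallet_id = $1
ORDER BY id DESC
LIMIT 10
)
UNION ALL
(  -- top 10 for the second wallet
SELECT id FROM transactions
WHERE wallet_id = $2
ORDER BY id DESC
LIMIT 10
)
ORDER BY id DESC  -- then the top 10 overall
LIMIT 10;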
All parentheses required.
Now the second query is just about twice the cost of the first, i.e. very fast.
If you look at this issue from the query planner's perspective, it has two significant operations to do here:
1. filter the rows by wallet_id;
2. order the rows by id descending and take the first 10.
For 1., the index you have created, transactions_wallet_id_index, is preferable. For 2., the index that comes with the primary key is better (well, scanning it backwards is). Note how your optimal query has a logical Sort operation after the data is filtered out, while the one with OR does not; it just has a limit.
I created a 2 mil table to recreate your scenario.
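A minimal sketch of such a setup (column names follow the question; the wallet_id distribution is made up, and reproducing the effect needs a few tiny wallets next to large ones):
-- 2-million-row test table
CREATE TABLE transactions (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    wallet_id bigint NOT NULL
);

INSERT INTO transactions (wallet_id)
SELECT (random() * 10000)::bigint   -- made-up distribution
FROM generate_series(1, 2000000);

CREATE INDEX transactions_wallet_id_index ON transactions (wallet_id);
ANALYZE transactions;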
Now if we select for two very small wallets, like say 2 and 3, it'll look like what you're expecting, a bitmap OR of the two conditions:
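The statement being explained would be along these lines:
EXPLAIN ANALYZE
SELECT id
FROM transactions
WHERE wallet_id = 2 OR wallet_id = 3
ORDER BY id DESC
LIMIT 10;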
But at some point, we see what you’re seeing, wallets 3 & 4:
The query planner just thinks sorting that set is going to be more expensive than filtering it.
There's not really a way around it given this query (that I would know of); OR queries are just messy when it comes to statistics, and it's definitely not just Postgres that suffers from it, believe me.
You can try ANALYZE transactions, but chances are your stats are up to date. Extended statistics won't help you here either. You could possibly tweak some of the performance settings (especially work_mem and maybe random_page_cost) to force the correct plan here, but you risk messing up other queries.
One sure way to optimize this is to do it yourself. As I said, OR queries are problematic. Try creating a supporting index for your query (sketched below), and then we can try a couple of options.
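A plausible supporting index covers the filter column and the sort column together, the same shape as in the previous answer (name illustrative):
CREATE INDEX transactions_wallet_id_id_idx ON transactions (wallet_id, id);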
This may come as a surprise to someone used to SQL Server, for example, but with Postgres you can get a different plan by using IN (in SQL Server, IN just gets translated to a bunch of ORs), and you get a nice plan out of it.
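A sketch of the IN rewrite, keeping the shape of the question's query:
SELECT id
FROM transactions
WHERE wallet_id IN ($1, $2)
ORDER BY id DESC
LIMIT 10;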
You can also try messing with the planner's head by telling it "these are not the IDs you're looking for" (or have an index for). Note the 0+id in the sketch below; 1*id or id+id would also work, anything that makes it look different from what the existing index / PK is on:
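A sketch of that variant:
SELECT id
FROM transactions
WHERE wallet_id = $1 OR wallet_id = $2
ORDER BY 0 + id DESC   -- no index matches this sort expression
LIMIT 10;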
Under some circumstances, you could also get away with just splitting the query into two and unioning them together:
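Roughly, as a sketch:
SELECT id FROM transactions WHERE wallet_id = $1
UNION
SELECT id FROM transactions WHERE wallet_id = $2
ORDER BY id DESC
LIMIT 10;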
(this might not work very well for bigger wallets though)
You have plenty of options; IN should work best, but you might want to try them all to see which produces the best results.