This query:
select h.id
from history h
where
h.processed = true
order by coalesce (h.last_process_attempt_time, h.delivery_time)
limit 5
executes fast (6ms), because almost every row has processed = true
. But when I change to this:
explain analyze
select h.id
from pgz75008.history h
where
h.processed = false -- the only change is here
ORDER BY coalesce (h.last_process_attempt_time, h.delivery_time)
limit 5
now execution takes over 1.5 seconds (260x times more than previously). This is my index:
CREATE INDEX test_idx ON history USING btree (processed, last_process_attempt_time, delivery_time);
First query plan:
Limit (cost=18242.80..18242.81 rows=5 width=45) (actual time=6.866..6.869 rows=5 loops=1)
-> Sort (cost=18242.80..18468.04 rows=90095 width=45) (actual time=6.863..6.864 rows=5 loops=1)
Sort Key: (COALESCE(last_process_attempt_time, delivery_time))
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_idx on history h (cost=0.43..16746.36 rows=90095 width=45) (actual time=0.034..6.601 rows=1149 loops=1)
Index Cond: (processed = false)
Planning Time: 0.309 ms
Execution Time: 6.904 ms
second query plan:
Limit (cost=331050.32..331050.92 rows=5 width=45) (actual time=1497.691..1631.789 rows=5 loops=1)
-> Gather Merge (cost=331050.32..1127312.50 rows=6650224 width=45) (actual time=1484.761..1618.855 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Sort (cost=330050.26..334206.65 rows=1662556 width=45) (actual time=1420.135..1420.138 rows=4 loops=5)
Sort Key: (COALESCE(last_process_attempt_time, delivery_time))
Sort Method: top-N heapsort Memory: 25kB
Worker 0: Sort Method: top-N heapsort Memory: 25kB
Worker 1: Sort Method: top-N heapsort Memory: 25kB
Worker 2: Sort Method: top-N heapsort Memory: 25kB
Worker 3: Sort Method: top-N heapsort Memory: 25kB
-> Parallel Seq Scan on history h (cost=0.00..302435.80 rows=1662556 width=45) (actual time=11.190..1000.568 rows=1347834 loops=5)
Filter: processed
Rows Removed by Filter: 230
Planning Time: 0.233 ms
JIT:
Functions: 21
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 8.992 ms, Inlining 0.000 ms, Optimization 5.794 ms, Emission 61.213 ms, Total 75.999 ms
Execution Time: 1634.044 ms
Any ideas how can I make it faster?
2
Answers
What if
Booleans are difficult to be useful in an index, since you have only three options: null, true and false. Pretty easy to be useless.
What if you create an index like this:
Maybe it’s useful to include the processed column, like this: