I have huge words table which I’m running LIKE query on:
create table words
(
id int,
word varchar
)
It works pretty long. Index doesn’t help a lot, so I’m trying to partition it by word
column:
create table words
(
id int,
word varchar
) partition by RANGE (word);
CREATE TABLE words_1 PARTITION OF words
FOR VALUES FROM ('a') TO ('n');
CREATE TABLE words_2 PARTITION OF words
FOR VALUES FROM ('n') TO ('z');
NOTE: Actually I’m planning to make 1 partition for each letter. Use only 2 of them for example simplicity.
So partitioning seems to work OK with equality and gt/lt operators:
explain
select * from words where word = 'abc'
Seq Scan on words_1 words (cost=0.00..25.88 rows=6 width=36)
Filter: ((word)::text = 'abc'::text)
explain
select * from words where word >= 'nth'
Seq Scan on words_2 words (cost=0.00..25.88 rows=423 width=36)
Filter: ((word)::text >= 'nth'::text)
But on LIKE queries it keeps scanning both partitions:
explain
select * from words where word LIKE 'abc%'
Append (cost=0.00..51.81 rows=12 width=36)
-> Seq Scan on words_1 (cost=0.00..25.88 rows=6 width=36)
Filter: ((word)::text ~~ 'abc'::text)
-> Seq Scan on words_2 (cost=0.00..25.88 rows=6 width=36)
Filter: ((word)::text ~~ 'abc'::text)
Is there a way to make partitioning work on LIKE queries?
Maybe is there another way to achieve what i want?
2
Answers
I cannot replicate your results. Your queries use an index, even with only a handful of rows. Make sure your tables have been analyzed and run your queries with
explain analyze
.An improved table design would drop the
id
and use the word as a primary key, assuming they’re unique. I’ve added about 1000 words and analyzed the table.All of your queries do an index-only scan.
The default B-Tree index can do exact matches (
words = 'this'
), trailing wildcards (words like 'this%'
) and ordering. We can improve this further adding a Gist index using trigram ops.Now queries such as
word like '%this%'
will use the Gist index.I’m a bit surprised it doesn’t just work, at least in the C collation. But I can verify that it doesn’t.
You could rewrite the query manually the same way
word like 'abc%'
sometimes gets rewritten:But this is only guaranteed to give the same answer in the C collation.
By the way, you should check partition pruning with EXPLAIN ANALYZE. It is possible the partition pruning to only happen at run time, in which case all partitions still show up in the plan EXPLAIN plan. (But run-time pruning isn’t the case here, I checked)