I have a PostgreSQL table named users which contains between 20 to 30 million records. The table has the following columns: id
, tenant_id
, name
, company
, location
, and description
.
I need to perform generic search queries on this table with various combinations of columns. Here is an example of such a query:
SELECT *
FROM users
WHERE name = 'raju'
AND company LIKE '%kg%'
AND description LIKE '%ght%'
AND tenant_id = 'tenant_1234';
Given that there are many possible combinations of columns in these search queries, the response time is currently very high. To address this, I am considering using compound indexes. However, I am concerned that this will slow down the write operations.
My Question:
What are some effective strategies to optimize these search queries without significantly impacting the performance of write operations? Any suggestions or best practices would be greatly appreciated.
2
Answers
One of the challenging things about SQL and compound indexes is how hard it is to generalize rules about creating those indexes. In almost every application that grows and gains new data, it is necessary to routinely assess which queries are slowing down and create indexes to support them. This may need to be done monthly, or even weekly in a fast-growing app. Setting up the right indexes often is not something a developer can do prior to deployment and then forget.
Let’s take a look at the query you showed us.
A BTREE index on
(name, tenant_id)
or(tenant_id, name)
will help this particular query. BTREE indexes oncompany
ordescription
won’t because you use the notoriouscolumn LIKE '%searchterm%'
performance antipattern (leading%
) there. You can, in PostgreSQL, accelerate the antipattern with a GIN index in place of BTREE, but that’s a longer discussion.It may be that other query filter patterns are also helped by the two indexes I mentioned. If either
name
ortenant_id
is selective enough they may help a lot.The insertion overhead of indexes usually gets a bad rap. Unless your insertion rate is very high and you have many indexes, you probably should not worry about this.
Try to use an INCLUDE index like this one :