skip to Main Content

I have a PostgreSQL table named users which contains between 20 to 30 million records. The table has the following columns: id, tenant_id, name, company, location, and description.

I need to perform generic search queries on this table with various combinations of columns. Here is an example of such a query:

SELECT * 
FROM users 
WHERE name = 'raju' 
  AND company LIKE '%kg%' 
  AND description LIKE '%ght%' 
  AND tenant_id = 'tenant_1234';

Given that there are many possible combinations of columns in these search queries, the response time is currently very high. To address this, I am considering using compound indexes. However, I am concerned that this will slow down the write operations.

My Question:

What are some effective strategies to optimize these search queries without significantly impacting the performance of write operations? Any suggestions or best practices would be greatly appreciated.

2

Answers


  1. One of the challenging things about SQL and compound indexes is how hard it is to generalize rules about creating those indexes. In almost every application that grows and gains new data, it is necessary to routinely assess which queries are slowing down and create indexes to support them. This may need to be done monthly, or even weekly in a fast-growing app. Setting up the right indexes often is not something a developer can do prior to deployment and then forget.

    Let’s take a look at the query you showed us.

    WHERE name = 'raju' 
      AND company LIKE '%kg%' 
      AND description LIKE '%ght%' 
      AND tenant_id = 'tenant_1234'
    

    A BTREE index on (name, tenant_id) or (tenant_id, name) will help this particular query. BTREE indexes on company or description won’t because you use the notorious column LIKE '%searchterm%' performance antipattern (leading %) there. You can, in PostgreSQL, accelerate the antipattern with a GIN index in place of BTREE, but that’s a longer discussion.

    It may be that other query filter patterns are also helped by the two indexes I mentioned. If either name or tenant_id is selective enough they may help a lot.

    The insertion overhead of indexes usually gets a bad rap. Unless your insertion rate is very high and you have many indexes, you probably should not worry about this.

    Login or Signup to reply.
  2. Try to use an INCLUDE index like this one :

    CREATE INDEX X001 ON users (name, tenant_id) INCLUDE (company, description);
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search