Why does my PostgreSQL query improve when I disabled nested loop and what to do about it?

DubravkoPetrovic
August 27, 2023
204 views
0 votes
2 Answers

The database used is PostgreSQL. I have a query that combines two tables and does a where search on some category and then does order by by some property. The table products has a 10+ million products. If I put a random name on category name that doesn’t exist the query should return 0 rows and it does but the problem is that the query becomes super slow and starts executing for 80+ seconds when used with Order By but when I remove order by it is fast and only when I query for the category that doesn’t exist. Planner decides to go through all the rows and it becomes super slow.

Query:

explain analyze SELECT p.* FROM products p
INNER JOIN product_categories pc ON p.id = pc.product_id
WHERE pc.category_id = 'asf'
ORDER BY p.updated_at DESC LIMIT 21;

    Limit  (cost=1001.02..26849.88 rows=21 width=240) (actual time=35099.387..35396.240 rows=0 loops=1)
    ->  Gather Merge  (cost=1001.02..6249039.48 rows=5076 width=240) (actual time=35099.385..35396.237 rows=0 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Nested Loop  (cost=1.00..6247453.56 rows=2115 width=240) (actual time=35061.491..35061.492 rows=0 loops=3)
            ->  Parallel Index Scan Backward using idx_products_updated_at on products p  (cost=0.43..2596897.52 rows=5482982 width=240) (actual time=0.057..4123.748 rows=4563924 loops=3)
            ->  Index Scan using idx_product_categories_product_id on product_categories pc  (cost=0.56..0.66 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=13691771)
                Index Cond: (product_id = p.id)
                Filter: (category_id = 'asf'::text)
                Rows Removed by Filter: 1
    Planning Time: 1.157 ms
    Execution Time: 35396.346 ms

I tried to set SET enable_nestloop = off; and it started working perfectly again, but I don’t know if it’s a good thing to set this to off and if not how to avoid this problem? It only happens when I use a made up category it goes crazy on the speed, but when the category exists it is fast. The order by is the culprit as when I remove it, it starts going fast.

Answers

- MaimoonaAbid
- August 27, 2023 at 5:40 pm
- 0 votes
0
Try using EXISTS subquery, if you use this query then there will be no need of direct joining of the tables or filtering by category.
Try the following code;
```
EXPLAIN ANALYZE
SELECT p.*
FROM products p
WHERE EXISTS (
    SELECT 1
    FROM product_categories pc
    WHERE pc.product_id = p.id
    AND pc.category_id = 'asf'
)
ORDER BY p.updated_at DESC
LIMIT 21;
```
Login or Signup to reply.

- FrankHeikens
- August 27, 2023 at 6:01 pm
- 0 votes
0
Since you didn’t share the DDL for both tables and their indexes, we can only guess that there isn’t an index on category_id. And most likely you benefit most from a combination with product_id:
```
CREATE INDEX idx_product_categories_product_id_category_id
ON product_categories (product_id, category_id);
```
Give it a try, and please share the new query plan.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.