skip to Main Content

I am trying to execute the same SQL but with different values for the where clause. One query is taking significantly longer time to process than the other. I have also observed that the execution plan for the two queries is different too,

Query1 and Execution Plan:

explain analyze
select t."postal_code"
from dev."postal_master" t
   left join dev."premise_master" f
      on t."primary_code" = f."primary_code"
         and t."name" = f."name"
         and t."final_code" = f."final_code"
where 1 = 1  and t."region" = 'US' 
  and t."name" = 'UBQ' 
  and t."accountModCode" = 'LTI' 
  and t."modularity_code" = 'PHA'
group by t."postal_code", t."modularity_code", t."region",
      t."feature", t."granularity"

Group  (cost=4.19..4.19 rows=1 width=38) (actual time=76411.456..76414.348 rows=11871 loops=1)
  Group Key: t."postal_code", t."modularity_code", t."region", t."feature", t.granularity
  ->  Sort  (cost=4.19..4.19 rows=1 width=38) (actual time=76411.452..76412.045 rows=11879 loops=1)
        Sort Key: t."postal_code", t."feature", t.granularity
        Sort Method: quicksort  Memory: 2055kB
        ->  Nested Loop Left Join  (cost=0.17..4.19 rows=1 width=38) (actual time=45.373..76362.219 rows=11879 loops=1)
              Join Filter: (((t."name")::text = (f."name")::text) AND ((t."primary_code")::text = (f."primary_code")::text) AND ((t."final_code")::text = (f."final_code")::text))
              Rows Removed by Join Filter: 150642887
              ->  Index Scan using idx_postal_code_source on postal_master t  (cost=0.09..2.09 rows=1 width=72) (actual time=36.652..154.339 rows=11871 loops=1)
                    Index Cond: (("name")::text = 'UBQ'::text)
                    Filter: ((("region")::text = 'US'::text) AND (("accountModCode")::text = 'LTI'::text) AND (("modularity_code")::text = 'PHA'::text))
                    Rows Removed by Filter: 550164
              ->  Index Scan using idx_postal_master_source on premise_master f  (cost=0.08..2.09 rows=1 width=35) (actual time=0.016..3.720 rows=12690 loops=11871)
                    Index Cond: (("name")::text = 'UBQ'::text)
Planning Time: 1.196 ms
Execution Time: 76415.004 ms

Query2 and Execution plan:

explain analyze
select t."postal_code"
from dev."postal_master" t
   left join dev."premise_master" f
      on t."primary_code" = f."primary_code"
         and t."name" = f."name"
         and t."final_code" = f."final_code"
where 1 = 1  and t."region" = 'DE' 
  and t."name" = 'EME' 
  and t."accountModCode" = 'QEW' 
  and t."modularity_code" = 'NFX'
group by t."postal_code", t."modularity_code", t."region",
         t."feature", t."granularity"

Group  (cost=50302.96..50426.04 rows=1330 width=38) (actual time=170.687..184.772 rows=8230 loops=1)
  Group Key: t."postal_code", t."modularity_code", t."region", t."feature", t.granularity
  ->  Gather Merge  (cost=50302.96..50423.27 rows=1108 width=38) (actual time=170.684..182.965 rows=8230 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Group  (cost=49302.95..49304.62 rows=554 width=38) (actual time=164.446..165.613 rows=2743 loops=3)
              Group Key: t."postal_code", t."modularity_code", t."region", t."feature", t.granularity
              ->  Sort  (cost=49302.95..49303.23 rows=554 width=38) (actual time=164.444..164.645 rows=3432 loops=3)
                    Sort Key: t."postal_code", t."feature", t.granularity
                    Sort Method: quicksort  Memory: 550kB
                    Worker 0:  Sort Method: quicksort  Memory: 318kB
                    Worker 1:  Sort Method: quicksort  Memory: 322kB
                    ->  Nested Loop Left Join  (cost=1036.17..49297.90 rows=554 width=38) (actual time=2.143..148.372 rows=3432 loops=3)
                          ->  Parallel Bitmap Heap Scan on territory_postal_mapping t  (cost=1018.37..38323.78 rows=554 width=72) (actual time=1.898..11.849 rows=2743 loops=3)
                                Recheck Cond: ((("accountModCode")::text = 'QEW'::text) AND (("region")::text = 'DE'::text) AND (("name")::text = 'EME'::text))
                                Filter: (("modularity_code")::text = 'NFX'::text)
                                Rows Removed by Filter: 5914
                                Heap Blocks: exact=2346
                                ->  Bitmap Index Scan on territorypostal__source_region_mod  (cost=0.00..1018.31 rows=48088 width=0) (actual time=4.783..4.783 rows=25973 loops=1)
                                      Index Cond: ((("accountModCode")::text = 'QEW'::text) AND (("region")::text = 'DE'::text) AND (("name")::text = 'EME'::text))
                          ->  Bitmap Heap Scan on premise_master f  (cost=17.80..19.81 rows=1 width=35) (actual time=0.047..0.048 rows=1 loops=8230)
                                Recheck Cond: (((t."primary_code")::text = ("primary_code")::text) AND ((t."final_code")::text = ("final_code")::text))
                                Filter: ((("name")::text = 'EME'::text) AND ((t."name")::text = ("name")::text))
                                Heap Blocks: exact=1955
                                ->  BitmapAnd  (cost=17.80..17.80 rows=1 width=0) (actual time=0.046..0.046 rows=0 loops=8230)
                                      ->  Bitmap Index Scan on premise_master__accountprimarypostal  (cost=0.00..1.95 rows=105 width=0) (actual time=0.008..0.008 rows=24 loops=8230)
                                            Index Cond: ((t."primary_code")::text = ("primary_code")::text)
                                      ->  Bitmap Index Scan on premise_master__accountfinalterritorycode  (cost=0.00..15.80 rows=1403 width=0) (actual time=0.065..0.065 rows=559 loops=4568)
                                            Index Cond: ((t."final_code")::text = ("final_code")::text)
Planning Time: 1.198 ms
Execution Time: 185.197 ms

I am aware that there will be different number of rows depending on the where condition but is that the only reason for the different execution plan. Also, how can I improve the performance of the first query.

2

Answers


  1. The first line of each EXPLAIN ANALYZE output suggests that the planner only expected 1 row from the first query, while it expected 1130 from the second, so that’s probably why it chose a less efficient query plan. That usually means table statistics aren’t up to date, and when they were last run there weren’t many rows that would’ve matched the first query (maybe the data was being loaded in alphabetical order?). In this case the fix is to execute an ANALYZE dev."postal_master"query to refresh the statistics.

    You could also try removing the GROUP BY clause entirely (if your tooling allows). I could be misreading but it doesn’t look like it’s affecting the output much. If that results in unwanted duplicates you can use select distinct t.postal_code instead of the group by.

    Login or Signup to reply.
  2. The estimates are totally wrong for the first query, so it is no surprise that PostgreSQL picks a bad plan. Try these measures one after the other and see if they help:

    1. Collect statistics:

      ANALYZE premise_master, postal_master;
      
    2. Calculate more precise statistics:

      ALTER TABLE premise_master ALTER name SET statistics 1000;
      ALTER TABLE postal_master ALTER name SET statistics 1000;
      ANALYZE premise_master, postal_master;
      

    The estimates in the first query are off in such a bad way that I suspect that there is an exceptional problem, like an upgrade with pg_upgrade where you forgot to run ANALYZE afterwards, or you are wiping the database statistics with pg_stat_reset().

    If that is not the case, and a simple ANALYZE of the tables did the trick, the cause of the problem must be that autoanalyze does not run often enough on these tables. You can tune autovacuum to do that more often with a statement like this:

    ALTER TABLE premise_master SET (autovacuum_analyze_scale_factor = 0.01);
    

    That would make PostgreSQL collect statistics whenever 1% of the table has changed.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search