skip to Main Content

After various tests, I saw huge difference in execution time for those similar queries :

select id from my_table where creation_date > (current_timestamp - interval '2 days');  
-- 9s Plan: partition pruning parallel scan of 5 partitions

select id from my_table where creation_date > (select current_timestamp - interval '2 days');
-- 4.5s Plan: seq scan of all partitions

select id from my_table where creation_date > (localtimestamp - interval '2 days');     
-- 7s Plan: partition pruning parallel scan of the 5 last partitions

select id from my_table where creation_date > (select localtimestamp - interval '2 days'); 
-- 2.5s Plan: seq scan of all partitions

select id from my_table where creation_date > '2022-11-23 11:20'; 
-- 2.5s Plan: partition pruning parallel scan of the 5 last partitions

The creation_date is partition key of type timestamp (without timezone). I guess there’s something with type conversion.
When there’s a select in the where clause, the plan will scan all partitions. It behave like it can’t guess the outcome of the select. But it’s way faster than without select.
When there’s no select, the plan prune the partitions not required, and do seqscan only on the remaining but it is way slower.

Then I launch the same queries on same table, but through a foreign table from another database (on same instance for tests)

select id from my_foreign_table where creation_date > (current_timestamp - interval '2 days'); 
-- too long (more than 10min)

select id from my_foreign_table where creation_date > (select current_timestamp - interval '2 days');
-- too long (more than 10min)

select id from my_foreign_table where creation_date > (localtimestamp - interval '2 days');
-- too long (more than 10min)

select id from my_foreign_table where creation_date > (select localtimestamp - interval '2 days');
-- 2.5s 

select id from my_foreign_table where creation_date > '2022-11-23 11:20'; 
-- 2.6s 

Can someone explain why there is this much difference between those queries ? What’s going on under the hood ?

Execution plan for queries without select in where clause (partition pruning, parallel scan):

QUERY PLAN                                                                                                     |
---------------------------------------------------------------------------------------------------------------+
Gather  (cost=1000.00..62310996.53 rows=1244782 width=8)                                                       |
  Workers Planned: 8                                                                                           |
  ->  Parallel Append  (cost=0.00..62185518.33 rows=155598 width=8)                                            |
        Subplans Removed: 175                                                                                  |
        ->  Parallel Seq Scan on data_container_archive_2022_11 dca  (cost=0.00..820761.36 rows=169177 width=8)|
              Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |
        ->  Parallel Seq Scan on data_container_archive_2022_12 dca_1  (cost=0.00..12.88 rows=55 width=8)      |
              Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |
        ->  Parallel Seq Scan on data_container_archive_2023_01 dca_2  (cost=0.00..12.88 rows=55 width=8)      |
              Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |
        ->  Parallel Seq Scan on data_container_archive_2023_02 dca_3  (cost=0.00..12.88 rows=55 width=8)      |
              Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |
        ->  Parallel Seq Scan on data_container_archive_2023_03 dca_4  (cost=0.00..12.88 rows=55 width=8)      |
              Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |

Execution plan for queries with select in where clause (all partitions, seq scan):

QUERY PLAN                                                                                             |
-------------------------------------------------------------------------------------------------------+
Append  (cost=0.01..88407378.48 rows=775958175 width=8)                                                |
  InitPlan 1 (returns $0)                                                                              |
    ->  Result  (cost=0.00..0.01 rows=1 width=8)                                                       |
  ->  Seq Scan on data_container_archive_2008_04 dca  (cost=0.00..1427.44 rows=13078 width=8)          |
        Filter: (partition_key > $0)                                                                   |
  ->  Seq Scan on data_container_archive_2008_05 dca_1  (cost=0.00..1392.58 rows=12895 width=8)        |
        Filter: (partition_key > $0)                                                                   |
  ->  Seq Scan on data_container_archive_2008_06 dca_2  (cost=0.00..376.65 rows=3457 width=8)          |
        Filter: (partition_key > $0)                                                                   |
  ->  Seq Scan on data_container_archive_2008_07 dca_3  (cost=0.00..1638.68 rows=14685 width=8)        |
        Filter: (partition_key > $0)                                                                   |
  ->  Seq Scan on data_container_archive_2008_08 dca_4  (cost=0.00..2127.53 rows=19001 width=8)        |
        Filter: (partition_key > $0)                                                                   |
  ->  Seq Scan on data_container_archive_2008_09 dca_5  (cost=0.00..3121.90 rows=27971 width=8)        |
        Filter: (partition_key > $0)                                                                   |
  ->  Seq Scan on data_container_archive_2008_10 dca_6  (cost=0.00..3147.60 rows=28576 width=8)        |
        Filter: (partition_key > $0)                                                                   |
  ->  Seq Scan on data_container_archive_2008_11 dca_7  (cost=0.00..4151.62 rows=37297 width=8)        |
        Filter: (partition_key > $0)                                                                   |
  ->  Seq Scan on data_container_archive_2008_12 dca_8  (cost=0.00..5287.16 rows=47551 width=8)        |
        Filter: (partition_key > $0)                                                                   |
  ->  Seq Scan on data_container_archive_2009_01 dca_9  (cost=0.00..7169.21 rows=64752 width=8)        |
        Filter: (partition_key > $0)                                                                   |
  ... 

Note : The names of the table and column are the real ones here.

2

Answers


  1. In the queries that compare with the result of a subquery, PostgreSQL doesn’t "pull up" the subquery, so it doesn’t know the result of the subquery by the time the execution starts. Consequently, it doesn’t prune partitions.

    Yes, the PostgreSQL optimizer might be made smarter and recognize that it could pull up the subqueries, but that would cost extra processing time that will hurt other queries.

    Login or Signup to reply.
  2. In short:

    localtimestamp is system clock value (thus no calculations) and

    current_timestamp is timezone-aware object, which requires look up of operating system’s time zone settings (at a penalty).

    Here is a nice write-up: https://www.commandprompt.com/education/postgresql-current_timestamp-vs-localtimestamp/

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search