
I have two tables, T1 and T2, and I'm doing a simple inner join:

select t1.a, t1.b, t1.c 
from T1 t1 
inner join T2 t2 on t1.c = t2.c

T1 has 2 million rows and T2 has 4 million rows.

According to explain analyze, the query takes approximately 3 seconds to execute.

Column c is indexed on both T1 and T2.

What can I do to improve this, or is there an alternative way of writing this query? I do not need to select any columns from T2.

I'd appreciate any help.

2 Answers


  1. Depending on the requirements, table partitioning can be a solution.
    https://www.postgresql.org/docs/current/ddl-partitioning.html
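
    A minimal sketch of what that could look like, assuming hash partitioning on the join column c. The table names t1_part and t2_part and the choice of 4 partitions are illustrative and not taken from the question. When both tables are partitioned identically on the join key, the planner can join matching partitions pairwise if enable_partitionwise_join is on (it is off by default):

    ```sql
    -- Hypothetical partitioned copies of the tables, hash-partitioned on the join key.
    create table t1_part (a text, b text, c int) partition by hash (c);
    create table t1_part_0 partition of t1_part for values with (modulus 4, remainder 0);
    create table t1_part_1 partition of t1_part for values with (modulus 4, remainder 1);
    create table t1_part_2 partition of t1_part for values with (modulus 4, remainder 2);
    create table t1_part_3 partition of t1_part for values with (modulus 4, remainder 3);

    create table t2_part (a text, b text, c int) partition by hash (c);
    create table t2_part_0 partition of t2_part for values with (modulus 4, remainder 0);
    create table t2_part_1 partition of t2_part for values with (modulus 4, remainder 1);
    create table t2_part_2 partition of t2_part for values with (modulus 4, remainder 2);
    create table t2_part_3 partition of t2_part for values with (modulus 4, remainder 3);

    -- An index created on the partitioned parent is created on every partition.
    create index on t1_part (c);
    create index on t2_part (c);

    -- Copy the existing rows and refresh statistics.
    insert into t1_part select a, b, c from t1;
    insert into t2_part select a, b, c from t2;
    analyze t1_part;
    analyze t2_part;

    -- Allow the planner to consider joining partition to partition.
    set enable_partitionwise_join = on;

    explain (analyze, buffers)
    select t1.a, t1.b, t1.c
    from t1_part t1
    inner join t2_part t2 on t1.c = t2.c;
    ```

    Whether this beats the unpartitioned plan depends on the data and the hardware, so compare the explain output before committing to it.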

  2. That depends heavily on the characteristics of the data and the traffic on the tables. Until you share your table and index definitions along with an explain (analyze,buffers,verbose) output, I can only guess based on your mention of basic, single-column indexes aiding the operation: you could look into tuning some additional settings of those indexes. I’m assuming you’re already getting an index-only scan on t2 and a regular index scan on t1, as in this test setup:

    drop table if exists t1;
    create table t1(a text, b text, c int);
    insert into t1 select gen_random_uuid(),gen_random_uuid(),random()*1e6
    from generate_series(1,2e6,1)_(n);
    create index t1_c_idx on t1(c);
    analyze t1;
    
    drop table if exists t2;
    create table t2(a text, b text, c int);
    insert into t2 select gen_random_uuid(),gen_random_uuid(),random()*1e7
    from generate_series(1,4e6,1)_(n);
    create index t2_c_idx on t2(c);
    analyze t2;
    
    explain (analyze,buffers,verbose)
    select t1.a, t1.b, t1.c 
    from T1 t1 
    inner join T2 t2 on t1.c = t2.c;
    
                                                                       QUERY PLAN
    -------------------------------------------------------------------------------------------------------------------------------------------------
     Merge Join  (cost=44.65..218570.14 rows=2744902 width=78) (actual time=2.904..2532.031 rows=797062 loops=1)
       Output: t1.a, t1.b, t1.c
       Merge Cond: (t1.c = t2.c)
       Buffers: shared hit=2857658
       ->  Index Scan using t1_c_idx on public.t1  (cost=0.43..153103.24 rows=2000000 width=78) (actual time=0.009..1716.516 rows=2000000 loops=1)
             Output: t1.a, t1.b, t1.c
             Buffers: shared hit=2003982
       ->  Index Only Scan using t2_c_idx on public.t2  (cost=0.43..315262.71 rows=3999961 width=4) (actual time=0.018..543.618 rows=851437 loops=1)
             Output: t2.c
             Heap Fetches: 851437
             Buffers: shared hit=853676
     Planning:
       Buffers: shared hit=12
     Planning Time: 0.194 ms
     JIT:
       Functions: 5
       Options: Inlining false, Optimization false, Expressions true, Deforming true
       Timing: Generation 0.411 ms, Inlining 0.000 ms, Optimization 0.199 ms, Emission 2.570 ms, Total 3.181 ms
     Execution Time: 2561.986 ms
    

    The 2.5s above is close to your current timing. If you include the other two columns you’re fetching as payload on the index, you can get index-only scans on both tables (everything is found in the indexes, with no need to jump to the actual tables). In my case that gives a >4x speedup, down to 0.6s:

    drop index t1_c_idx; 
    create index t1_c_idx on t1(c)include(a,b);
    analyze t1;
    drop index t2_c_idx; 
    create index t2_c_idx on t2(c)include(a,b);
    analyze t2;
    
    explain (analyze,buffers,verbose )
    select t1.a, t1.b, t1.c 
    from T1 t1 
    inner join T2 t2 on t1.c = t2.c;
    
                                                                        QUERY PLAN
    ---------------------------------------------------------------------------------------------------------------------------------------------------
     Merge Join  (cost=3.21..177965.47 rows=2549607 width=78) (actual time=1.162..536.684 rows=797062 loops=1)
       Output: t1.a, t1.b, t1.c
       Merge Cond: (t1.c = t2.c)
       Buffers: shared hit=299842
       ->  Index Only Scan using t1_c_idx on public.t1  (cost=0.43..122364.43 rows=2000000 width=78) (actual time=0.034..193.624 rows=2000000 loops=1)
             Output: t1.c, t1.a, t1.b
             Heap Fetches: 0
             Buffers: shared hit=22992
       ->  Index Only Scan using t2_c_idx on public.t2  (cost=0.43..244611.84 rows=3999961 width=4) (actual time=0.011..106.730 rows=851437 loops=1)
             Output: t2.c, t2.a, t2.b
             Heap Fetches: 0
             Buffers: shared hit=276850
     Planning:
       Buffers: shared hit=12
     Planning Time: 0.180 ms
     JIT:
       Functions: 3
       Options: Inlining false, Optimization false, Expressions true, Deforming true
       Timing: Generation 0.243 ms, Inlining 0.000 ms, Optimization 0.109 ms, Emission 0.940 ms, Total 1.292 ms
     Execution Time: 565.847 ms
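
    Note that an index-only scan can skip the table only for pages that are marked all-visible in the visibility map, which vacuum maintains. The first plan's Heap Fetches: 851437 on t2 indicates those pages were not yet marked, while Heap Fetches: 0 here indicates they were by the time of this run (presumably after an autovacuum pass, which is an assumption about the test environment). A sketch of making that explicit after a bulk load:

    ```sql
    -- Set visibility-map bits and refresh planner statistics after loading data.
    vacuum (analyze) t1;
    vacuum (analyze) t2;

    -- Compare all-visible pages with total heap pages for each table.
    select relname, relpages, relallvisible
    from pg_class
    where relname in ('t1', 't2');
    ```

    If relallvisible is well below relpages, the scan will still have to visit the heap for the remaining pages.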
    
    

    If your data is completely static or only refreshed/reloaded in full, you can also compact the indexes: B-tree indexes default to fillfactor 90, leaving 10% of each leaf page empty to accommodate incoming rows, which those tables don’t need.

    drop index if exists t1_c_idx; -- the index from the previous step already uses this name
    create index t1_c_idx on t1(c)include(a,b)with(fillfactor=100);
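
    For an index that already exists, the same setting can be applied without dropping it. This is a sketch, assuming the index is still named t1_c_idx; the new fillfactor only takes effect once the index is rebuilt, and the size query lets you compare before and after:

    ```sql
    -- Size before the change.
    select pg_size_pretty(pg_relation_size('t1_c_idx')) as size_before;

    -- Change the storage parameter, then rebuild so existing pages are repacked.
    alter index t1_c_idx set (fillfactor = 100);
    reindex index t1_c_idx;

    -- Size after the rebuild.
    select pg_size_pretty(pg_relation_size('t1_c_idx')) as size_after;
    ```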
    

    If heap fetches are unavoidable or your payload is too big, you can consider cluster, which physically rewrites the table in index order:

    cluster t1 using t1_c_idx;
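
    A few caveats worth checking: cluster takes an ACCESS EXCLUSIVE lock while it rewrites the table, and it is a one-time operation, so rows inserted or updated later are not kept in index order. Statistics should be refreshed afterwards. The pg_stats query below is just one way to confirm the physical ordering, assuming the table lives in the public schema as in the plans above:

    ```sql
    -- Refresh statistics after the table has been rewritten.
    analyze t1;

    -- Correlation close to 1 means the heap order closely follows column c.
    select attname, correlation
    from pg_stats
    where schemaname = 'public'
      and tablename = 't1'
      and attname = 'c';
    ```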
    