skip to Main Content

I am investigating a performance issue with a Spring Boot application that uses Spring Batch 4.2.0 and Postgres 14.5 and ran into an interesting issue.

One of the top queries is the following. Explain plan shows that it executes a FTS(actually two).
The query Postgres side:

SELECT  SE.STEP_EXECUTION_ID, SE.STEP_NAME, SE.START_TIME, SE.END_TIME, SE.STATUS, SE.COMMIT_COUNT, SE.READ_COUNT, SE.FILTER_COUNT, SE.WRITE_COUNT, 
                SE.EXIT_CODE, SE.EXIT_MESSAGE, SE.READ_SKIP_COUNT, SE.WRITE_SKIP_COUNT, SE.PROCESS_SKIP_COUNT, SE.ROLLBACK_COUNT, SE.LAST_UPDATED, SE.VERSION, 
                JE.JOB_EXECUTION_ID, JE.START_TIME, JE.END_TIME, JE.STATUS, JE.EXIT_CODE, JE.EXIT_MESSAGE, JE.CREATE_TIME, JE.LAST_UPDATED, JE.VERSION 
  from N_BATCH_JOB_EXECUTION JE, N_BATCH_STEP_EXECUTION SE 
  where       
       SE.JOB_EXECUTION_ID in 
                     (SELECT JOB_EXECUTION_ID 
                        from N_BATCH_JOB_EXECUTION 
                       where JE.JOB_INSTANCE_ID = $1)     
   and SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID       
   and SE.STEP_NAME = $2 
   order by SE.START_TIME desc, SE.STEP_EXECUTION_ID desc

The explain plan:

Sort  (cost=154.08..154.11 rows=13 width=217)
  Sort Key: se.start_time DESC, se.step_execution_id DESC
  ->  Hash Join  (cost=69.71..153.83 rows=13 width=217)
        Hash Cond: (se.job_execution_id = je.job_execution_id)
        Join Filter: (SubPlan 1)
        ->  Bitmap Heap Scan on n_batch_step_execution se  (cost=4.49..88.54 rows=26 width=156)
              Recheck Cond: ((step_name)::text = 'notf0270f8e50fd95b44da8b34cb1ea829cc74'::text)
              ->  Bitmap Index Scan on n_batch_step_exec_step_name_job_exec_id_idx  (cost=0.00..4.48 rows=26 width=0)
                    Index Cond: ((step_name)::text = 'notf0270f8e50fd95b44da8b34cb1ea829cc74'::text)
        ->  Hash  (cost=45.10..45.10 rows=1610 width=77)
              ->  Seq Scan on n_batch_job_execution je  (cost=0.00..45.10 rows=1610 width=77)
        SubPlan 1
          ->  Result  (cost=0.00..45.10 rows=1610 width=8)
                One-Time Filter: (je.job_instance_id = 1)
                ->  Seq Scan on n_batch_job_execution  (cost=0.00..45.10 rows=1610 width=8)

This query didn’t seem quite right to me, tried to make it better, got something and decided to check the original. Found the query in Spring Batch source code(JdbcStepExecutionDao.java) and seems correct.
But it’s slightly different, enough to produce a different explain plan. Basically the subquery is not using an alias defined in the outer query(where JOB_INSTANCE_ID =… instead of **JE.**JOB_INSTANCE_ID = …)

  where       
       SE.JOB_EXECUTION_ID in 
                     (SELECT JOB_EXECUTION_ID 
                        from N_BATCH_JOB_EXECUTION 
                       where JOB_INSTANCE_ID = $1) 

Explain plan looks much better:

Sort  (cost=17.74..17.75 rows=1 width=217)
  Sort Key: se.start_time DESC, se.step_execution_id DESC
  ->  Nested Loop  (cost=0.59..17.73 rows=1 width=217)
        ->  Nested Loop  (cost=0.30..16.35 rows=1 width=77)
              ->  Index Scan using n_batch_job_execution_job_instance_id_idx on n_batch_job_execution  (cost=0.15..8.17 rows=1 width=8)
                    Index Cond: (job_instance_id = 1)
              ->  Index Scan using n_batch_job_execution_pkey on n_batch_job_execution je  (cost=0.15..8.17 rows=1 width=69)
                    Index Cond: (job_execution_id = n_batch_job_execution.job_execution_id)
        ->  Index Scan using n_batch_step_exe_idx on n_batch_step_execution se  (cost=0.29..1.37 rows=1 width=156)
              Index Cond: (job_execution_id = je.job_execution_id)
              Filter: ((step_name)::text = 'notf0270f8e50fd95b44da8b34cb1ea829cc74'::text)

The question I have is how did that query get re-written like that? Perhaps Postgres server does that, maybe the JDBC driver or some other layer?

2

Answers


  1. Chosen as BEST ANSWER

    thank you, Mahmoud, I agree with you. The explanation is quite simple in the end, 4.2.0.RELEASE contains the query that has this performance issue, and that's what the application uses, but I've checked the query in 4.2.x tree, my mistake. It looks like you've fixed the query with following commit: https://github.com/spring-projects/spring-batch/commit/47a6f019e3c657e1cc4625a9c5d142a4601c8907?diff=split


  2. The question I have is how did that query get re-written like that? Perhaps Postgres server does that, maybe the JDBC driver or some other layer?

    Spring Batch does not re-write queries. The query you see in the code is the one that is sent to your database.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search