skip to Main Content

update 20M records in table and set col1 = col2 + col3 for each record using spring boot.

try (Connection connection = dataSource.getConnection()) {
            String updateQuery = "UPDATE table SET col1 = ? WHERE id = ?";
            PreparedStatement preparedStatement = connection.prepareStatement(updateQuery);

        for (TableEntity entity : tableEntities) {
            preparedStatement.setInt(1, calculate(offersLeadScore)); // Set new col1
            preparedStatement.setObject(2, entity.getId());
            preparedStatement.addBatch();
        }
        preparedStatement.executeBatch();`

The above approach is taking so much time to execute and update all records in table.

How to achieve faster execution using spring boot?

Tried JPA’s saveAll() and above approach. Could not optimise it.

Need some industry standard approach to update all records in one table in faster way.

2

Answers


  1. Try this approach, it works by using a Spring Boot service called TableService to effectively update a big dataset of 20 million records inside a database table. The updateCol1ForTable function contains the main operation. This method builds a SQL update query that, using the record’s id as a reference, computes the new value for col1 as the sum of col2 and col3 for each record.
    Here is the code;

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.jdbc.core.BatchPreparedStatementSetter;
    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.stereotype.Service;
    
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;
    
    @Service
    public class TableService {
    
        private final JdbcTemplate jdbcTemplate;
    
        @Autowired
        public TableService(JdbcTemplate jdbcTemplate) {
            this.jdbcTemplate = jdbcTemplate;
        }
    
        public void updateCol1ForTable(List<TableEntity> tableEntities) {
            String updateQuery = "UPDATE your_table_name SET col1 = col2 + col3 WHERE id = ?";
            
            jdbcTemplate.batchUpdate(updateQuery, new BatchPreparedStatementSetter() {
                @Override
                public void setValues(PreparedStatement preparedStatement, int i) throws SQLException {
                    preparedStatement.setLong(1, tableEntities.get(i).getId());
                }
    
                @Override
                public int getBatchSize() {
                    return tableEntities.size();
                }
            });
        }
    }
    

    To efficiently update the records in batches, call this function from your service layer or controller.

    Hope it works 🙂

    Login or Signup to reply.
  2. Why? At best you are introducing a maintenance headache. Will every developer and user remember to appropriately update col1 whenever col2 and/or col3 is updated and every time a row is inserted. A much better solution is to perform the calculation on the select. It is a 1 time maintenance of the select. If for some reason you think you need to store the derived result then define col1 as a generated. First drop the column then re-add it: (demo here)

    alter table <your_table_name> drop col1; 
    alter table <your_table_name> 
          add col1 bigint
              generated always as (col2::bigint + col3::bigint) stored;
    

    The initial run/setting will not be fast. But the trade off is you have no maintenance and the value of col1 is automatically calculated when a row is added and whenever col2' and/or col3′ is updated. No additional maintenance required initially or later.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search