
We have a table with around 10 million records, and we are trying to update some columns using the id (primary key) in the WHERE clause.

UPDATE table_name SET column1=1, column2=0, column3='2022-10-30' WHERE id IN (1,2,3,4,5,6,7, ... etc);

Scenario 1: when there are 3000 or fewer ids in the IN clause, EXPLAIN shows PRIMARY for both ‘possible_keys’ and ‘key’, and the query executes very fast.

Scenario 2: when there are more than 3000 ids (up to 30K) in the IN clause, EXPLAIN shows NULL for ‘possible_keys’ and PRIMARY for ‘key’, and the query runs forever. If I use FORCE INDEX(PRIMARY), then both ‘possible_keys’ and ‘key’ show PRIMARY and the query executes very fast.

Scenario 3: when there are more than 30K ids in the IN clause, even if I use FORCE INDEX(PRIMARY), ‘possible_keys’ shows NULL, ‘key’ shows PRIMARY, and the query runs forever.
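
For reference, a sketch of the hinted form used in scenarios 2 and 3 (the id list here is just a placeholder; EXPLAIN supports UPDATE statements in 5.7):

EXPLAIN UPDATE table_name FORCE INDEX (PRIMARY)
SET column1=1, column2=0, column3='2022-10-30'
WHERE id IN (1,2,3,4,5,6,7);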

I believe the optimizer is going for a full table scan instead of an index scan. Can we make any change so that the optimizer chooses the index scan instead of the table scan? Please suggest any parameter changes that would overcome this issue.

The MySQL version is 5.7

2 Answers


  1. As far as I know, you just need to provide an ad-hoc table with all the ids and join table_name to it:

    update (select 1 id union select 2 union select 3) ids
    join table_name using (id) set column1=1, column2=0, column3='2022-10-30';
    

    In MySQL 8 you can use a VALUES table constructor, which is a little more terse (omit "row" for MariaDB, e.g. values (1),(2),(3)):

    update (select null id where 0 union all values row(1),row(2),row(3)) ids
    join table_name using (id) set column1=1, column2=0, column3='2022-10-30';
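
    A quick way to check the plan (and preview the affected rows) is to join the same derived table in a plain SELECT and run EXPLAIN on it; with id being the primary key, the lookup into table_name should use PRIMARY. A sketch, assuming the column names from the question:

    explain
    select t.id, t.column1, t.column2, t.column3
    from (select 1 id union select 2 union select 3) ids
    join table_name t using (id);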
    


  2. When UPDATEing a significant chunk of a table with all the same update values, I see a red flag.

    Do you always update the same set of rows? Could that info be in a smaller separate table that you JOIN to?

    Or maybe some other structural schema change could focus on making the UPDATEs faster?

    If you must have a long IN list, I suggest doing 100 ids at a time. And don’t try to COMMIT all 3000+ in the same transaction. (Committing in chunks may violate some business logic, so you may not want to do that.)
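
    A minimal sketch of that chunking idea, assuming hypothetical ids and a tiny batch size of 3 for readability (use roughly 100 per batch in practice), with each batch committed on its own:

    START TRANSACTION;
    UPDATE table_name SET column1=1, column2=0, column3='2022-10-30' WHERE id IN (1,2,3);
    COMMIT;

    START TRANSACTION;
    UPDATE table_name SET column1=1, column2=0, column3='2022-10-30' WHERE id IN (4,5,6);
    COMMIT;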
