skip to Main Content

I know there are a lot related questions with many answers, but I have a bit of a more nuanced question. I have been doing reading on different insert techniques for mass records, but are there limits on how big a query insert can be? Can the same technique be used for REPLACE and INSERT ...ON DUPLICATE KEY UPDATE ... ? Is there a faster method?

Table:

+-----------+-------------+------+-----+---------+----------------+
| Field     | Type        | Null | Key | Default | Extra          |
+-----------+-------------+------+-----+---------+----------------+
| a         | int(11)     | NO   | PRI | NULL    | auto_increment |
| b         | int(11)     | YES  |     | NULL    |                |
| c         | int(11)     | YES  |     | NULL    |                |

#1

1) "INSERT INTO TABLE COLUMNS (a,b,c) values (1,2,3);"
2) "INSERT INTO TABLE COLUMNS (a,b,c) values (5,6,7);"
3) "INSERT INTO TABLE COLUMNS (a,b,c) values (8,9,10);"
 ...
10,000) "INSERT INTO TABLE COLUMNS (a,b,c) values (30001,30002,30003);"

or

#2 – should be faster, but is there a limit?

"INSERT INTO TABLE COLUMNS (a,b,c) values (1,2,3),(4,5,6),(8,9,10)....(30001,30002,30003)" ;

From a scripting perspective (PHP), using #2, is it better to loop through and queue up 100 entries (1000 times)…or a 1000 entries (100 times), or just all 10,000 at once? Could this be done with 100,000 entries?

Can the same be used with REPLACE:

"REPLACE INTO TABLE (a, b, c) VALUES(1,2,3),(4,5,6)(7,8,9),...(30001,30002,30003);"

Can it also be used with INSERT ON DUPLICATE?

INSERT INTO TABLE (a, b, c) VALUES(1,2,3),(4,5,6),(7,8,9),....(30001,30002,30003) ON DUPLICATE KEY UPDATE (b=2,c=3)(b=5,c=6),(b=8,c=9),....(b=30002,c=30003) ?

For any and all of the above (assuming the replace/on duplicate are valid), are there faster methods to achieve the inserts?

2

Answers


  1. The length of any SQL statement is limited by a MySQL option called max_allowed_packet.

    The syntax of INSERT allows you to add an unlimited number of tuples after the VALUES clause, but the total length of the statement from INSERT to the last tuple must still be no more than the number of bytes equal to max_allowed_packet.

    Regardless of that, I have found that LOAD DATA INFILE is usually significantly faster than any INSERT syntax. It’s so much faster, that you might even find it faster to write your tuples to a temporary CSV file and then use LOAD DATA INFILE on that CSV file.

    You might like my presentation comparing different bulk-loading solutions in MySQL: Load Data Fast!

    Login or Signup to reply.
  2. #1 (single-row inserts) — Slow. A variant is INSERT IGNORE — beware: it burns AUTO_INCREMENT ids.

    #2 (batch insert) — Faster than #1 by a factor of 10. But do the inserts in batches of no more than 1000. (After that, you are into "diminishing returns" and may conflict with other activities.

    #3 REPLACE — Bad. It is essentially a DELETE plus an INSERT. Once IODKU was added to MySQL, I don’t think there is any use for REPLACE. All the old AUTO_INCREMENT ids will be tossed and new ones created.

    #4 IODKU (Upsert) — [If you need to test before Insert.] It can be batched, but not the way you presented it. (There is no need to repeat the b and c values.)

    INSERT INTO (
    INSERT INTO TABLE (a, b, c)
         VALUES(1,2,3),(4,5,6),(7,8,9),....(30001,30002,30003)
         ON DUPLICATE KEY UPDATE
             b = VALUES(b),
             c = VALUES(c);
    

    Or, in MySQL 8.0, the last 2 lines are:

             b = NEW.b,
             c = NEW.c;
    

    IODKU also burns ids.

    MySQL LOAD DATA INFILE with ON DUPLICATE KEY UPDATE discusses a 2-step process of LOAD + IODKU. Depending on how complex the "updates" are, 2+ steps may be your best answer.

    #5 LOAD DATA — as Bill mentions, this is a good way if the data comes from a file. (I am dubious about its speed if you also have to write the data to a file first.) Be aware of the usefulness of @variables to make minor tweaks as you do the load. (Eg, STR_TO_DATE(..) to fix a DATE format.)

    #6 INSERT ... SELECT ...; — If the data is already in some other table(s), you may as well combine the Insert and Select. This works for IODKU, too.

    As a side note, if you need to get AUTO_INCREMENT ids of each batched row, I recommend some variant on the following. It is aimed at batch-normalization of id-name pairs that might already exist in the mapping table. Normalization

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search