
I have a table and a stored procedure defined as follows:

CREATE TABLE `inspect_call` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `task_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `cc_number` varchar(63) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `created_at` bigint(20) unsigned NOT NULL DEFAULT '0',
  `updated_at` bigint(20) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `task_id` (`task_id`)
) ENGINE=InnoDB AUTO_INCREMENT=234031 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci 

CREATE PROCEDURE inspect_proc(IN task bigint,IN number varchar(63))
INSERT INTO inspect_call(task_id,cc_number) values (task, number)

I had assumed that calling the stored procedure would be (much) faster than just calling INSERT, but to my surprise that is NOT the case at all. When I insert 10,000 records, the plain INSERT command takes around 4 minutes while the stored procedure takes around 15 minutes.

I have run the test many times to confirm this. The MySQL server is not a high-end machine, but I don't understand why calling the stored procedure is so much slower.

# using mysql-connector-python 8.0.31
command = ("INSERT INTO inspect_call (task_id, cc_number) "
           "VALUES (%s, %s)")
for i in range(rows): 
    cursor.execute(command, (task_id,f"{cc}{i}"))
    # cursor.callproc("inspect_proc", (task_id,f"{cc}{i}"))
cnx.commit()

BTW, I read some articles saying that setting innodb_flush_log_at_trx_commit = 2 can improve insert speed, but I don't plan to do that.
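
(For reference, this is roughly what that change would look like from the same connection; I have not applied it, and it requires the SUPER or SYSTEM_VARIABLES_ADMIN privilege.)

# For reference only: with value 2 the redo log is written to the OS cache on
# each commit and fsync'd to disk roughly once per second, instead of an fsync
# per commit. Not applied in my tests.
cur = cnx.cursor()
cur.execute("SET GLOBAL innodb_flush_log_at_trx_commit = 2")
cur.execute("SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit'")
print(cur.fetchone())  # e.g. ('innodb_flush_log_at_trx_commit', '2')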

— update —

From the answers I got, I tried a bulk insert (executemany) to see if there was any improvement, but to my surprise there wasn't.

cursor = cnx.cursor(buffered=True)
for i in range(int(rows/1000)):
    data = []
    for j in range(1000):
        data.append((task_id, f"{cc}{i*1000+j}"))
    cursor.executemany(command, data)
cnx.commit()

# no improvement compared to

cursor = cnx.cursor()
for i in range(rows):
    cursor.execute(command, (task_id, f"{cc}{i}"))

I tried this many times (also with 100 records per executemany call) and found that their performance is basically the same.

Why is that?

— update 2 —

I finally figured out why the insert is so slow! It is because I was running the script from my laptop and accessing the database through its external host name. Once I uploaded the script to the server and accessed the DB from inside the intranet, it was much faster. Inserting 10,000 records takes around 3 to 4 seconds; inserting 100,000 records takes around 36 seconds. I did not realize the network could cause such a difference!

But executemany still didn't improve the performance in my case.

2 Answers


  1. Your example won't show any benefit from a stored procedure because it doesn't use any of the advantages of stored procedures.

    The main advantages of stored procedures are:

    • they are compiled
    • they save network exchanges (computations run on the server side)

    Imagine you have logic too complex to be expressed as a single UPDATE and you want to implement it in, say, Python. That requires:

    • selecting the rows -> network traffic [server -> client]
    • updating the rows -> quite slow: Python is interpreted, and maybe even slower if you use an ORM like SQLAlchemy (objects have to be created in memory)
    • sending back the updated rows -> network traffic [client -> server]

    Now imagine the same example implemented with a stored procedure: the logic runs on the server and the rows never leave it.
    In that kind of scenario, chances are the stored procedure really shines (see the sketch just below).
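
    A minimal sketch of that contrast, assuming a hypothetical payments table, a hypothetical compute_fee() Python function, and a hypothetical adjust_balances procedure that applies the same per-row logic in SQL (none of these names come from the question):

        # Client-side version: rows cross the network in both directions,
        # and the per-row logic runs in interpreted Python.
        cur = cnx.cursor()
        cur.execute("SELECT id, amount FROM payments WHERE status = 'pending'")  # server -> client
        updates = [(compute_fee(amount), row_id) for (row_id, amount) in cur]     # logic in Python
        cur.executemany("UPDATE payments SET fee = %s WHERE id = %s", updates)    # client -> server
        cnx.commit()

        # Stored-procedure version: a single round trip; the same loop lives
        # inside the (hypothetical) adjust_balances procedure on the server.
        cur.callproc("adjust_balances")
        cnx.commit()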

    In your example you don't have any logic; you just insert rows.
    It's an I/O-bound use case, so there is little or no gain from a compiled procedure.
    And you'll have just as many network exchanges as with plain INSERT,
    because the rows have to be sent to the server either way,
    so there is no gain in the amount of network traffic either.

    In your example, a bulk insert might help you reach better performance.

  2. MySQL is unlike many other engines in that ordinary statements are already reasonably fast; wrapping them in a stored procedure may add more overhead than it saves.

    You want faster? Batch the rows into a single INSERT (or, if there is a huge list, break it into clumps of 1000).

    See executemany().
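
    A minimal sketch of that batching with mysql-connector-python, reusing the command, task_id, cc, and rows variables from the question (the 1000-row chunk is just the clump size suggested above):

        # executemany() in mysql-connector-python batches a simple
        # "INSERT ... VALUES (%s, %s)" using multi-row INSERT syntax,
        # so each 1000-row chunk goes over the wire as one statement.
        CHUNK = 1000
        data = [(task_id, f"{cc}{i}") for i in range(rows)]

        cursor = cnx.cursor()
        for start in range(0, len(data), CHUNK):
            cursor.executemany(command, data[start:start + CHUNK])
        cnx.commit()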
