I have a table and a stored procedure, as follows:
CREATE TABLE `inspect_call` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `task_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `cc_number` varchar(63) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `created_at` bigint(20) unsigned NOT NULL DEFAULT '0',
  `updated_at` bigint(20) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `task_id` (`task_id`)
) ENGINE=InnoDB AUTO_INCREMENT=234031 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

CREATE PROCEDURE inspect_proc(IN task bigint, IN number varchar(63))
  INSERT INTO inspect_call(task_id, cc_number) VALUES (task, number);
I had assumed that calling the stored procedure would be (much) faster than plain INSERT statements. But to my surprise that is not the case at all: when I insert 10,000 records, the INSERT command takes around 4 minutes while the stored procedure takes around 15 minutes.
I have run the test many times to confirm this. The MySQL server is not a high-end machine, but I don't understand why calling the stored procedure is so much slower.
# using mysql-connector-python 8.0.31
command = ("INSERT INTO inspect_call (task_id, cc_number) "
           "VALUES (%s, %s)")
for i in range(rows):
    cursor.execute(command, (task_id, f"{cc}{i}"))
    # cursor.callproc("inspect_proc", (task_id, f"{cc}{i}"))
cnx.commit()
BTW, I read some articles saying that setting innodb_flush_log_at_trx_commit = 2 can improve insert speed, but I don't plan to do that.
— update —
From the answers I got, I tried a bulk insert (executemany) to see if there is any improvement, but to my surprise there isn't.
cursor = cnx.cursor(buffered=True)
for i in range(int(rows / 1000)):
    data = []
    for j in range(1000):
        data.append((task_id, f"{cc}{i*1000+j}"))
    cursor.executemany(command, data)
cnx.commit()

# no improvement compared to

cursor = cnx.cursor()
for i in range(rows):
    cursor.execute(command, (task_id, f"{cc}{i}"))
I tried many times (also with 100 records per executemany call) and found their performance is basically the same. Why is that?
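For what it's worth, Connector/Python's executemany() is documented to rewrite a simple INSERT ... VALUES statement into a single multi-row INSERT, so each 1000-row batch should travel as one statement. A minimal sketch of building that multi-row statement by hand (the helper name is mine; table and columns are from the question):

```python
def build_multirow_insert(n):
    """Build one INSERT covering n rows, with a flat parameter list expected.

    Roughly what Connector/Python's executemany() produces when it
    rewrites "INSERT ... VALUES (%s, %s)" for a list of n row tuples.
    """
    placeholders = ", ".join(["(%s, %s)"] * n)
    return "INSERT INTO inspect_call (task_id, cc_number) VALUES " + placeholders

task_id, cc = 1, "138"
rows = [(task_id, f"{cc}{i}") for i in range(3)]
sql = build_multirow_insert(len(rows))
params = [v for row in rows for v in row]  # flatten to match the placeholders
# cursor.execute(sql, params)  # would send one statement, one round trip
```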
— update 2 —
I finally figured out why the inserts were so slow: I was running the script from my laptop and accessing the database via its external host name. Once I uploaded the script to the server and accessed the DB from inside the intranet, it was much faster: inserting 10,000 records takes around 3 to 4 seconds, and inserting 100,000 records takes around 36 seconds. I did not expect the network could make such a difference!
BUT executemany still didn't improve the performance in my case, though.
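The gap between the laptop and intranet runs is consistent with simple round-trip arithmetic: each executed statement costs at least one network round trip. A rough sketch (the latency figures below are my assumptions, chosen to match the reported timings):

```python
def total_seconds(rows, round_trip_ms):
    """Lower-bound wall time if each INSERT costs one network round trip."""
    return rows * round_trip_ms / 1000.0

# ~24 ms per WAN round trip reproduces the ~4-minute laptop timing;
# ~0.35 ms on the intranet lands near the observed 3-4 seconds.
wan = total_seconds(10_000, 24)     # 240 seconds, i.e. 4 minutes
lan = total_seconds(10_000, 0.35)   # about 3.5 seconds
```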
2 Answers
Your example doesn't do the stored procedure justice, because it doesn't use any of the advantages of stored procedures.
The main advantage of a stored procedure is that it runs on the server, next to the data.
Imagine logic complex enough that it cannot be expressed as a single UPDATE, which you would like to implement in, e.g., Python: it requires fetching the rows over the network, processing them on the client, and sending the results back.
Now imagine the same example implemented with a stored procedure: everything happens server-side, in a single round trip.
In that kind of example, chances are the stored procedure really shines.
In your example, however, there is no logic: you just insert rows.
It's an I/O-bound use case, so there is no (or little) gain from a compiled procedure.
And you'll have just as many network exchanges as with plain INSERTs: either way, every row has to be sent to the server, so there is no gain in network traffic either.
In your example, a bulk insert may be what helps reach the best performance. MySQL is unlike many other engines in that ordinary statements are reasonably fast, and wrapping them in a stored procedure may add more overhead than it saves.
You want faster? Batch the rows into a single INSERT (or, if the list is huge, break it into clumps of 1000). See executemany().
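The "clumps of 1000" suggestion can be sketched like this (the helper names are mine; cursor and cnx are assumed to exist as in the question):

```python
def chunked(seq, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

def insert_in_clumps(cursor, rows, clump=1000):
    """Send one multi-row INSERT statement per clump of rows."""
    for clump_rows in chunked(rows, clump):
        values = ", ".join(["(%s, %s)"] * len(clump_rows))
        sql = "INSERT INTO inspect_call (task_id, cc_number) VALUES " + values
        flat = [v for row in clump_rows for v in row]  # flatten row tuples
        cursor.execute(sql, flat)

# usage sketch:
# rows = [(task_id, f"{cc}{i}") for i in range(100_000)]
# insert_in_clumps(cursor, rows)
# cnx.commit()
```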