skip to Main Content

I need to speed up an import in MySQL.
Let’s assume that i have an external table "external.prices"

articleNumber price regionNumber
"A0000000001" 1.90 "R1"
"A0000000001" 1.99 "R2"
"A0000000002" 4.99 "R1"

Internally i have the following tables

"internal.articles"

id articleNumber
1 "A0000000001"
2 "A0000000002"

"internal.regions"

id regionNumber
1 "R1"
2 "R2"

"internal.prices" (empty)

articleID regionID price

Now i need to fill the "internal.prices" table from "external.prices".

In my thoughts the fastest way to import was:

  1. Also adding column "articleNumber" and "regionNumber" to internal.prices.
    like this:
articleID regionID price articleNumber regionNumber
  1. Then directly inserting the prices with this insert:
    INSERT INTO internal.prices
    (articleNumber, regionNumber, price)
    (SELECT articleNumber, regionNumber, price FROM external.prices)
  1. Then updating the articleID and regionID whith this update:
    UPDATE internal.prices p
    SET p.articleID = (SELECT id 
                        FROM internal.articles 
                        WHERE articleNumber = p.articleNumber ),
        p.regionID = (SELECT id 
                        FROM internal.regions 
                        WHERE regionNumber = p.regionNumber )

Turns out the first insert is extremly fast, but updating the articleID and regionID is very very slow especially for a large number of prices.
Note: I already have indexes on articleNumber and regionNumber in all tables.

What is the fastest way to do this?

2

Answers


  1. One query can do this everything (except adding columns, of course):

    INSERT INTO internal.prices (articleID, articleNumber, regionID, regionNumber, price)
    SELECT a.id, articleNumber, r.id, regionNumber, p.price 
    FROM external.prices p
    JOIN internal.articles a USING (articleNumber)
    JOIN internal.regions r USING (regionNumber)
    
    Login or Signup to reply.
  2. If you are replacing the entire table, then there is a better way: Load the new table, RENAME TABLE, then DROP TABLE.

    If the incoming data is partial, then use IODKU rather than UPDATE. This automatically inserts or updates in a single step.

    (Your "3" looks unworkable.")

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search