I have a performance issue with a query, which is potentially written incorrectly.
Dear all.
I have two tables.
One table, "test_cdr", with roughly 5M rows, contains three columns:
- POINT_TARGET | the called phone number
- dest | should contain a destination identifier
- DURATION | the duration of the call
Another table, "codes", contains two columns:
- area_codes | a list of country prefixes
- dest | the destination identifier
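For reference, the two tables look roughly like this (the column types are simplified guesses on my part):

CREATE TABLE test_cdr (
    POINT_TARGET VARCHAR(32),   -- called phone number
    dest         VARCHAR(64),   -- destination identifier, initially NULL
    DURATION     INT            -- call duration
);

CREATE TABLE codes (
    area_codes   VARCHAR(16),   -- country prefix
    dest         VARCHAR(64)    -- destination identifier
);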
Now I need to populate test_cdr.dest by:
- running every number in test_cdr.POINT_TARGET against codes.area_codes
- identifying the best-matching codes.area_codes entry (see the example below)
- taking the value in codes.dest and copying it into test_cdr.dest
That way, I can then sum the duration of the calls per destination type.
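To illustrate the best-match part: for a hypothetical number such as 33612345678, both prefixes 33 and 336 could match, and the longest one (336) should win, e.g.:

SELECT t.dest
FROM codes t
WHERE '33612345678' LIKE CONCAT(t.area_codes, '%')
ORDER BY LENGTH(t.area_codes) DESC  -- longest matching prefix first
LIMIT 1;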
At the moment, I am using this query:
UPDATE test_cdr AS main
JOIN (
    SELECT c.POINT_TARGET, t.dest
    FROM test_cdr c
    JOIN codes t ON c.POINT_TARGET LIKE CONCAT(t.area_codes, '%')
    WHERE c.dest IS NULL -- only update rows where dest is NULL
    LIMIT 1000           -- adjust the batch size as needed
) AS matched ON main.POINT_TARGET = matched.POINT_TARGET
SET main.dest = matched.dest;
It only works with LIMIT, and it updates roughly 10,000 rows every 20 seconds, which is pretty poor performance.
As soon as I remove the LIMIT, the query runs but nothing happens on the database (I see no disk activity, whereas with LIMIT there is disk activity).
Could you help me figure out the issue, or maybe suggest a better approach?
By the way, I have already created indexes.
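For what it's worth, they look something like this (the index names are just placeholders):

CREATE INDEX idx_codes_area ON codes (area_codes);
CREATE INDEX idx_cdr_target ON test_cdr (POINT_TARGET);
CREATE INDEX idx_cdr_dest   ON test_cdr (dest);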
Thank you.
2 Answers
After a suggestion from Konteye, I modified the query:
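A plausible version of that modification (a reconstruction, since the exact query is not reproduced here) joins codes directly instead of going through a derived table:

UPDATE test_cdr AS main
JOIN codes t ON main.POINT_TARGET LIKE CONCAT(t.area_codes, '%')
SET main.dest = t.dest
WHERE main.dest IS NULL;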
It is still very slow, but it is processing.
I recommend creating a procedure to paginate through the table. Here’s an example:
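A minimal sketch, assuming MySQL; the procedure name populate_dest and the batch size of 1000 are placeholders. It repeats the batched UPDATE until a pass updates no rows:

DELIMITER $$
CREATE PROCEDURE populate_dest()
BEGIN
    DECLARE affected INT DEFAULT 1;
    WHILE affected > 0 DO
        -- update one batch of rows whose dest is still NULL
        UPDATE test_cdr AS main
        JOIN (
            SELECT c.POINT_TARGET, t.dest
            FROM test_cdr c
            JOIN codes t ON c.POINT_TARGET LIKE CONCAT(t.area_codes, '%')
            WHERE c.dest IS NULL
            LIMIT 1000
        ) AS matched ON main.POINT_TARGET = matched.POINT_TARGET
        SET main.dest = matched.dest;
        -- ROW_COUNT() returns the number of rows changed by the last UPDATE;
        -- the loop stops once a batch changes nothing
        SET affected = ROW_COUNT();
    END WHILE;
END$$
DELIMITER ;

You would then run it with:

CALL populate_dest();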
Be sure to test this procedure thoroughly before using it in a production environment.