My PHP code iterates a big MySQL table using PDO and an unbuffered connection: SELECT ID,X,Y FROM tbl WHERE Z IS NULL
.
For every row, it updates a field Z with the result of op(X,Y)
. Only rows where Z is unvalued are processed. The update is performed through a separate db connection, with the following query: UPDATE tbl SET Z=? WHERE X=? AND Y=?
. Same (X,Y) have same Z value, so I found updating multiple rows faster than updating the current row only (it would be UPDATE tbl SET Z=? WHERE ID=?
). Z is calculated by PHP calling external web service, so I couldn’t rewrite it as stored procedure using cursors. I’d like the unbuffered query to be sensitive to changes, so the next row returned isn’t a row already processed (Z valued). Lowering isolation levels or inserting Z in a new table, joined to tbl
in the select query, didn’t help.
There is a way to do that?
This was my PHP code:
$stmt = $db_unbuf->prepare('SELECT id,x,y,z FROM tbl');
$stmt->execute();
while($r = $stmt->fetch()) {
$z = op($r->x, $r->y);
$stmt2 = $db->prepare('UPDATE tbl SET z=? WHERE x=? AND y=?');
$stmt2->execute([$z,$x,$y]);
};
I managed to achieve the desired result in two ways:
1- redoing the query with LIMIT 0,1 after every UPDATE:
$stmt = $db_unbuf->prepare('SELECT id,x,y,z FROM tbl LIMIT 0,1');
$stmt->execute();
while($r = $stmt->fetch()) {
$z = op($r->x, $r->y);
$stmt2 = $db->prepare('UPDATE tbl SET z=? WHERE x=? AND y=?');
$stmt2->execute([$z,$x,$y]);
$stmt->execute(); // added
};
2- using GROUP BY (or SELECT DISTINCT)
$stmt = $db_unbuf->prepare('SELECT id,x,y,z FROM tbl GROUP BY X,Y');
$stmt->execute();
while($r = $stmt->fetch()) {
$z = op($r->x, $r->y);
$stmt2 = $db->prepare('UPDATE tbl SET z=? WHERE x=? AND y=?');
$stmt2->execute([$z,$x,$y]);
};
2
Answers
what I would do:
create table temptable as (
ID,
z columnz)
insert the record using your modified query and the web server, then only update the original table all at once at the end using the following logic:
update originaltable o
set z=(select z from temptable t where o.ID=t.ID)
where Z is null and exists (select z from temptable t where o.ID=t.ID);
drop table temptable;
Simple:
Add that clause on to verify that at the time of the update z is still NULL.