skip to Main Content

My PHP code iterates a big MySQL table using PDO and an unbuffered connection: SELECT ID,X,Y FROM tbl WHERE Z IS NULL.
For every row, it updates a field Z with the result of op(X,Y). Only rows where Z is unvalued are processed. The update is performed through a separate db connection, with the following query: UPDATE tbl SET Z=? WHERE X=? AND Y=?. Same (X,Y) have same Z value, so I found updating multiple rows faster than updating the current row only (it would be UPDATE tbl SET Z=? WHERE ID=?). Z is calculated by PHP calling external web service, so I couldn’t rewrite it as stored procedure using cursors. I’d like the unbuffered query to be sensitive to changes, so the next row returned isn’t a row already processed (Z valued). Lowering isolation levels or inserting Z in a new table, joined to tbl in the select query, didn’t help.

There is a way to do that?

This was my PHP code:

$stmt = $db_unbuf->prepare('SELECT id,x,y,z FROM tbl');
$stmt->execute();
while($r = $stmt->fetch()) {
    $z = op($r->x, $r->y);
    $stmt2 = $db->prepare('UPDATE tbl SET z=? WHERE x=? AND y=?');
    $stmt2->execute([$z,$x,$y]);
};

I managed to achieve the desired result in two ways:

1- redoing the query with LIMIT 0,1 after every UPDATE:

$stmt = $db_unbuf->prepare('SELECT id,x,y,z FROM tbl LIMIT 0,1');
$stmt->execute();
while($r = $stmt->fetch()) {
    $z = op($r->x, $r->y);
    $stmt2 = $db->prepare('UPDATE tbl SET z=? WHERE x=? AND y=?');
    $stmt2->execute([$z,$x,$y]);
    $stmt->execute(); // added
};

2- using GROUP BY (or SELECT DISTINCT)

$stmt = $db_unbuf->prepare('SELECT id,x,y,z FROM tbl GROUP BY X,Y');
$stmt->execute();
while($r = $stmt->fetch()) {
    $z = op($r->x, $r->y);
    $stmt2 = $db->prepare('UPDATE tbl SET z=? WHERE x=? AND y=?');
    $stmt2->execute([$z,$x,$y]);
};

2

Answers


  1. what I would do:

    create table temptable as (
    ID,
    z columnz)

    insert the record using your modified query and the web server, then only update the original table all at once at the end using the following logic:

    update originaltable o
    set z=(select z from temptable t where o.ID=t.ID)
    where Z is null and exists (select z from temptable t where o.ID=t.ID);

    drop table temptable;

    Login or Signup to reply.
  2. so the next row returned isn’t a row already processed

    Simple:

    UPDATE ... WHERE ...
        AND z IS NULL
    

    Add that clause on to verify that at the time of the update z is still NULL.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search