skip to Main Content

In order to test various settings into my postgresql hot standby replication schema I need to reproduce a situation where the following error:

SQLSTATE[40001]: Serialization failure: 7 ERROR:  canceling statement due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.

Therefore, I try to make 2 processes 1 that updates forever a boolean field with its opposite and one that reads the value from the replica.

The update script is this one (loopUpdate.php):

$engine = 'pgsql';
$host = 'mydb.c3rrdbjxxkkk.eu-central-1.rds.amazonaws.com';
$database = 'dummydb';
$user = 'dummyusr'; 
$pass = 'dummypasswd';
$dns = $engine.':dbname='.$database.";host=".$host; 

$pdo = new PDO($dns,$user,$pass, [
  PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION
]);

echo "Continious update a field on et_store in order to cause new row version.".PHP_EOL;

while(true)
{
       $pdo->exec("UPDATE mytable SET boolval= NOT boolval where id=52");
}

And the read script is the following (./loopRead.php):

$engine = 'pgsql';
$host = 'mydb_replica.c3rrdbjxxkkk.eu-central-1.rds.amazonaws.com';
$database = 'dummydb';
$user = 'dummyusr'; 
$pass = 'dummypasswd';
$dns = $engine.':dbname='.$database.";host=".$host; 

$pdo = new PDO($dns,$user,$pass, [
  PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION
]);

echo "Continious update a field on et_store in order to cause new row version.".PHP_EOL;

while(true)
{
    $value=$pdo->exec("SELECT id, boolval FROM mytable  WHERE id=52");
    var_dump($value);
    echo PHP_EOL;
}

And I execute them in parallel:

# From one shell session
$ php ./loopUpdate.php 
# From another one shell session
$ php ./loopRead.php 

The mydb_replica.c3rrdbjxxkkk.eu-central-1.rds.amazonaws.com is hot standby read replica of the mydb.c3rrdbjxxkkk.eu-central-1.rds.amazonaws.com.

But I fail to make the loopRead.php to fail with the error:

SQLSTATE[40001]: Serialization failure: 7 ERROR:  canceling statement due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.

As far as I know the error I try to reproduce is because postgresql VACUUM action is performed during an active read transaction on read replica that asks rather stale data. So how I can cause my select statement to select on stale versions of my row?

2

Answers


  1. Chosen as BEST ANSWER

    In order to cause your error you need to place a HUGE delay into your select query itself via a pg_delay postgresql function, therefore changing your query into:

    SELECT id, boolval, pg_sleep(1000000000) FROM mytable  WHERE id=52
    

    So on a single transaction you have a "heavy" query and maximizes the chances of causing a PostgreSQL serialization error.

    Though the detail will differ:

    DETAIL:  User was holding shared buffer pin for too long.
    

    In tat case try to reduce the pg_delay value from 1000000000 into 10.


  2. On the standby, set max_standby_streaming_delay to 0 and hot_standby_feedback to off.

    Then start a transaction on the standby:

    SELECT *, pg_sleep(10) FROM atable;
    

    Then DELETE rows from atable and VACUUM (VERBOSE) it on the primary server. Make sure some rows are removed.

    Then you should be able to observe a replication conflict.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search