In order to test various settings into my postgresql hot standby replication schema I need to reproduce a situation where the following error:
SQLSTATE[40001]: Serialization failure: 7 ERROR: canceling statement due to conflict with recovery
DETAIL: User query might have needed to see row versions that must be removed.
Therefore, I try to make 2 processes 1 that updates forever a boolean field with its opposite and one that reads the value from the replica.
The update script is this one (loopUpdate.php
):
$engine = 'pgsql';
$host = 'mydb.c3rrdbjxxkkk.eu-central-1.rds.amazonaws.com';
$database = 'dummydb';
$user = 'dummyusr';
$pass = 'dummypasswd';
$dns = $engine.':dbname='.$database.";host=".$host;
$pdo = new PDO($dns,$user,$pass, [
PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION
]);
echo "Continious update a field on et_store in order to cause new row version.".PHP_EOL;
while(true)
{
$pdo->exec("UPDATE mytable SET boolval= NOT boolval where id=52");
}
And the read script is the following (./loopRead.php
):
$engine = 'pgsql';
$host = 'mydb_replica.c3rrdbjxxkkk.eu-central-1.rds.amazonaws.com';
$database = 'dummydb';
$user = 'dummyusr';
$pass = 'dummypasswd';
$dns = $engine.':dbname='.$database.";host=".$host;
$pdo = new PDO($dns,$user,$pass, [
PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION
]);
echo "Continious update a field on et_store in order to cause new row version.".PHP_EOL;
while(true)
{
$value=$pdo->exec("SELECT id, boolval FROM mytable WHERE id=52");
var_dump($value);
echo PHP_EOL;
}
And I execute them in parallel:
# From one shell session
$ php ./loopUpdate.php
# From another one shell session
$ php ./loopRead.php
The mydb_replica.c3rrdbjxxkkk.eu-central-1.rds.amazonaws.com
is hot standby read replica of the mydb.c3rrdbjxxkkk.eu-central-1.rds.amazonaws.com
.
But I fail to make the loopRead.php
to fail with the error:
SQLSTATE[40001]: Serialization failure: 7 ERROR: canceling statement due to conflict with recovery
DETAIL: User query might have needed to see row versions that must be removed.
As far as I know the error I try to reproduce is because postgresql VACUUM
action is performed during an active read transaction on read replica that asks rather stale data. So how I can cause my select statement to select on stale versions of my row?
2
Answers
In order to cause your error you need to place a HUGE delay into your select query itself via a
pg_delay
postgresql function, therefore changing your query into:So on a single transaction you have a "heavy" query and maximizes the chances of causing a PostgreSQL serialization error.
Though the detail will differ:
In tat case try to reduce the pg_delay value from
1000000000
into10
.On the standby, set
max_standby_streaming_delay
to 0 andhot_standby_feedback
tooff
.Then start a transaction on the standby:
Then
DELETE
rows fromatable
andVACUUM (VERBOSE)
it on the primary server. Make sure some rows are removed.Then you should be able to observe a replication conflict.