I have a simple table that is as seen below:
Table "public.test"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
id | integer | | not null |
name | text | | |
Indexes:
"ss_pkey" PRIMARY KEY, btree (id)
The table currently has two rows:
id | name
----+---------
0 | Matthew
1 | Thomas
(2 rows)
Now I have two transactions running concurrently:
Transaction 1
BEGIN;
WITH locked_rows AS(
SELECT *
FROM TEST
FOR UPDATE
)
INSERT INTO TEST
VALUES
(1, 'Chris'),
(0, 'John')
ON CONFLICT(id) DO
UPDATE
SET name = excluded.name
WHERE test.name <> excluded.name
COMMIT;
Transaction 2
BEGIN;
WITH locked_rows AS(
SELECT *
FROM TEST
FOR UPDATE
)
INSERT INTO TEST
VALUES
(0, 'Bryan'),
(1, 'Steven')
ON CONFLICT(id) DO
UPDATE
SET name = excluded.name
WHERE test.name <> excluded.name
COMMIT;
The values to be inserted are purposefully inserted in reverse order between the transactions because I want to trigger a deadlock and see if it can be avoided by using SELECT...FOR UPDATE;
. Now I understand that changing the order will prevent the deadlock but I still want to understand what’s happening here.
When these two transactions are executed concurrently, they always deadlock with the error:
ERROR: deadlock detected
DETAIL: Process 29588 waits for ShareLock on transaction 255002; blocked by process 29010.
Process 29010 waits for ShareLock on transaction 255003; blocked by process 29588.
HINT: See server log for query details.
CONTEXT: while inserting index tuple (0,27) in relation "test"
SQL state: 40P01
but if I remove the CTE and simply break each transaction into two separate commands, e.g:
BEGIN;
SELECT * FROM TEST FOR UPDATE;
INSERT INTO TEST
VALUES
(1, 'Chris'),
(0, 'John')
ON CONFLICT(id) DO
UPDATE
SET name = excluded.name
WHERE test.name <> excluded.name
COMMIT;
and the same for the other transaction, then the deadlocks disappear and the second transaction simply waits for the first one to release the lock on the rows, which is what I expect to happen.
I would expect the same to happen when I use a CTE since the SELECT...FOR UPDATE
is still executed in the transaction, but that is not the case. I have also tried declaring the CTE AS MATERIALIZED
to see if that will make a difference but nothing changed.
From the Postgres docs, I see this:
If specific tables are named in a locking clause, then only rows coming from those tables are locked; any other tables used in the SELECT are simply read as usual. A locking clause without a table list affects all tables used in the statement. If a locking clause is applied to a view or sub-query, it affects all tables used in the view or sub-query. However, these clauses do not apply to WITH queries referenced by the primary query. If you want row locking to occur within a WITH query, specify a locking clause within the WITH query.
but I don’t really understand what it means and I think this is referring to the [ OF table_name [, ...] ]
optional argument in the FOR UPDATE
.
Is postgres completely ignoring the CTE because it is not used in subsequent queries? If that’s the case, then why do other commands such as DELETE
or UPDATE
work properly inside an unused CTE? Or is something else happening behind the scenes?
Note: I’m using PostgreSQL 15.2
2
Answers
both transactions are trying to acquire an
exclusive
lock on the same row at the same time.In the first transaction, you are using a
CTE
to select all rows from the test table and lock them withFOR UPDATE.
This means that the first transaction will acquire anexclusive
lock on all rows in the test table.In the second transaction, you are also using a CTE to select all rows from the test table. However, this CTE does not use the
FOR UPDATE
clause. This means that the second transaction will only acquire ashared lock on the rows
in the test table.When the second transaction tries to update a row that is already locked by the first transaction, a
deadlock
will occur.To avoid this deadlock, you can either:
FOR UPDATE
clause in both CTEsUsing
FOR UPDATE
clause in bothCTEs
single transaction to perform both the select and the update:
the second
CTE
is also using theFOR UPDATE
clause. However, thedeadlock
still occurs because the second transaction is trying to acquire anexclusive
lock on a row that is already locked by the first transaction.The reason why the second transaction is able to acquire a
shared loc
k on the rows in the test table is because the first transaction has already acquired an exclusive lock on those rows. This is because theFOR UPDATE
clause is ablocking lock.
This means that the second transaction will wait until the first transaction releases the exclusive lock on the rows before it can acquire a shared lock on those rows.However, when the second transaction tries to update a row that is already locked by the first transaction, a deadlock will occur because the second transaction is now trying to acquire an
exclusive
lock on the row. This will cause the first transaction to block, and the second transaction to block. This will create adeadlock
situation that cannot be resolved.To avoid this deadlock, you can either use the
FOR UPDATE
clause in both CTEs, or you can use a single transaction to perform both the select and the update.You are getting a deadlock because the CTE is never executed.
A CTE is only executed if it contains a DML statement or if it is used in the main statement. See the documentation:
Two notes about avoiding deadlocks with
SELECT ... FOR UPDATE
:unless you plan to delete the row or modify a key, you should use
FOR NO KEY UPDATE
to avoid excessive lockingtwo sessions running
SELECT ... FROM tab FOR NO KEY UPDATE
could still deadlock with each otherbecause of synchronized sequential scans (see the
synchronize_seqscans
parameter)if there are concurrent updates on the table, and the snapshots of both
SELECT
s see the rows in different orders