skip to Main Content

I have a simple table that is as seen below:

Table "public.test"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           | not null | 
 name   | text    |           |          | 
Indexes:
    "ss_pkey" PRIMARY KEY, btree (id)

The table currently has two rows:

id |  name   
----+---------
  0 | Matthew
  1 | Thomas
(2 rows)

Now I have two transactions running concurrently:

Transaction 1

BEGIN;

WITH locked_rows AS(
    SELECT *
    FROM TEST
    FOR UPDATE
)
INSERT INTO TEST
VALUES 
(1, 'Chris'),
(0, 'John') 
ON CONFLICT(id) DO
UPDATE
SET name = excluded.name
WHERE test.name <> excluded.name

COMMIT;

Transaction 2

BEGIN;

WITH locked_rows AS(
    SELECT *
    FROM TEST
    FOR UPDATE
)
INSERT INTO TEST
VALUES 
(0, 'Bryan'),
(1, 'Steven') 
ON CONFLICT(id) DO
UPDATE
SET name = excluded.name
WHERE test.name <> excluded.name

COMMIT;

The values to be inserted are purposefully inserted in reverse order between the transactions because I want to trigger a deadlock and see if it can be avoided by using SELECT...FOR UPDATE;. Now I understand that changing the order will prevent the deadlock but I still want to understand what’s happening here.

When these two transactions are executed concurrently, they always deadlock with the error:

ERROR:  deadlock detected
DETAIL:  Process 29588 waits for ShareLock on transaction 255002; blocked by process 29010.
Process 29010 waits for ShareLock on transaction 255003; blocked by process 29588.
HINT:  See server log for query details.
CONTEXT:  while inserting index tuple (0,27) in relation "test"
SQL state: 40P01

but if I remove the CTE and simply break each transaction into two separate commands, e.g:

BEGIN;

SELECT * FROM TEST FOR UPDATE;

INSERT INTO TEST
VALUES 
(1, 'Chris'),
(0, 'John') 
ON CONFLICT(id) DO
UPDATE
SET name = excluded.name
WHERE test.name <> excluded.name

COMMIT;

and the same for the other transaction, then the deadlocks disappear and the second transaction simply waits for the first one to release the lock on the rows, which is what I expect to happen.

I would expect the same to happen when I use a CTE since the SELECT...FOR UPDATE is still executed in the transaction, but that is not the case. I have also tried declaring the CTE AS MATERIALIZED to see if that will make a difference but nothing changed.
From the Postgres docs, I see this:

If specific tables are named in a locking clause, then only rows coming from those tables are locked; any other tables used in the SELECT are simply read as usual. A locking clause without a table list affects all tables used in the statement. If a locking clause is applied to a view or sub-query, it affects all tables used in the view or sub-query. However, these clauses do not apply to WITH queries referenced by the primary query. If you want row locking to occur within a WITH query, specify a locking clause within the WITH query.

but I don’t really understand what it means and I think this is referring to the [ OF table_name [, ...] ] optional argument in the FOR UPDATE.

Is postgres completely ignoring the CTE because it is not used in subsequent queries? If that’s the case, then why do other commands such as DELETE or UPDATE work properly inside an unused CTE? Or is something else happening behind the scenes?

Note: I’m using PostgreSQL 15.2

2

Answers


  1. both transactions are trying to acquire an exclusive lock on the same row at the same time.

    In the first transaction, you are using a CTE to select all rows from the test table and lock them with FOR UPDATE. This means that the first transaction will acquire an exclusive lock on all rows in the test table.

    In the second transaction, you are also using a CTE to select all rows from the test table. However, this CTE does not use the FOR UPDATE clause. This means that the second transaction will only acquire a shared lock on the rows in the test table.

    When the second transaction tries to update a row that is already locked by the first transaction, a deadlock will occur.

    To avoid this deadlock, you can either:

    • Use the FOR UPDATE clause in both CTEs
    • Use a single transaction to perform both the select and the update

    Using FOR UPDATE clause in both CTEs

    BEGIN;
    
    WITH locked_rows AS(
        SELECT *
        FROM TEST
        FOR UPDATE
    ),
    updated_rows AS(
        INSERT INTO TEST
        VALUES 
        (1, 'Chris'),
        (0, 'John') 
        ON CONFLICT(id) DO
        UPDATE
        SET name = excluded.name
        WHERE test.name <> excluded.name
    )
    SELECT *
    FROM locked_rows
    JOIN updated_rows USING (id);
    
    COMMIT;
    

    single transaction to perform both the select and the update:

    BEGIN;
    
    SELECT *
    FROM TEST
    FOR UPDATE;
    
    INSERT INTO TEST
    VALUES 
    (1, 'Chris'),
    (0, 'John') 
    ON CONFLICT(id) DO
    UPDATE
    SET name = excluded.name
    WHERE test.name <> excluded.name;
    
    COMMIT;
    

    the second CTE is also using the FOR UPDATE clause. However, the deadlock still occurs because the second transaction is trying to acquire an exclusive lock on a row that is already locked by the first transaction.

    The reason why the second transaction is able to acquire a shared lock on the rows in the test table is because the first transaction has already acquired an exclusive lock on those rows. This is because the FOR UPDATE clause is a blocking lock. This means that the second transaction will wait until the first transaction releases the exclusive lock on the rows before it can acquire a shared lock on those rows.

    However, when the second transaction tries to update a row that is already locked by the first transaction, a deadlock will occur because the second transaction is now trying to acquire an exclusive lock on the row. This will cause the first transaction to block, and the second transaction to block. This will create a deadlock situation that cannot be resolved.

    To avoid this deadlock, you can either use the FOR UPDATE clause in both CTEs, or you can use a single transaction to perform both the select and the update.

    Login or Signup to reply.
  2. You are getting a deadlock because the CTE is never executed.

    A CTE is only executed if it contains a DML statement or if it is used in the main statement. See the documentation:

    Data-modifying statements in WITH are executed exactly once, and always to completion, independently of whether the primary query reads all (or indeed any) of their output. Notice that this is different from the rule for SELECT in WITH: as stated in the previous section, execution of a SELECT is carried only as far as the primary query demands its output.

    Two notes about avoiding deadlocks with SELECT ... FOR UPDATE:

    1. unless you plan to delete the row or modify a key, you should use FOR NO KEY UPDATE to avoid excessive locking

    2. two sessions running SELECT ... FROM tab FOR NO KEY UPDATE could still deadlock with each other

      • because of synchronized sequential scans (see the synchronize_seqscans parameter)

      • if there are concurrent updates on the table, and the snapshots of both SELECTs see the rows in different orders

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search