I don’t understand why simple stored procedure or anonymous block that doesn’t query any tables would prevent vacuum to remove dead rows.
I connect to postgres and run the following:
DO
$BODY$
DECLARE
v_id INT;
BEGIN
for v_id in 1..10
LOOP
RAISE INFO '%. sleeping...', v_id;
PERFORM pg_sleep(10);
END LOOP;
END;
$BODY$
LANGUAGE 'plpgsql';
While the above block is running, I open 2nd connection and run the following:
update foo set fname=md5(random()::text), lname=md5(random()::text) where id>170 and id<=180;
UPDATE 10
and then vacuum:
vacuum freeze verbose foo;
and it returns:
tuples: 0 removed, 999984 remain, 10 are dead but not yet removable
If I continue updating, the number of dead rows keeps growing. And it only gets removed when the anonymous block finishes. So I don’t understand why this simple block doesn’t let vacuum to remove dead rows?
2
Answers
You can split up your lengthy process into smaller transactions, take an alternative technique to minimizing locking, or both to prevent blocking vacuum operations while performing large transactions. Here is a smaller transaction-based coding solution:
For each loop iteration, a new transaction is started in this code. The locks are released as soon as each transaction is finished, containing the output of each iteration within its own transaction. This ought to enable the VACUUM procedure to work as intended and eliminate dead rows.
I’ve also included an exception block to deal with any problems that may arise during the iterations. You can alter logic for handling errors to meet your requirements.
When you run your UPDATE statement, it marks the rows as dead (i.e., they are no longer needed by any active transaction), but it doesn’t immediately remove them. Instead, PostgreSQL keeps these dead rows until the transaction that created them is completed. This is done to ensure transaction isolation and consistency.
In your case, the DO block you’re running is an anonymous transaction block. It keeps an open transaction throughout its execution because you have not explicitly committed or rolled back the transaction within the block. This open transaction holds a snapshot of the database, including the rows that were marked as dead by the UPDATE statement.
As long as this transaction is open, the dead rows cannot be removed by the VACUUM process because they are still visible to your transaction. Once the DO block completes (either successfully or due to an error), the transaction will be terminated, and PostgreSQL can safely remove the dead rows during the next VACUUM operation.You can either commit or rollback the transaction explicitly within your DO block as shown below.