skip to Main Content

I don’t understand why simple stored procedure or anonymous block that doesn’t query any tables would prevent vacuum to remove dead rows.

I connect to postgres and run the following:

DO
$BODY$
DECLARE
    v_id INT;
BEGIN
    for v_id in 1..10 
    LOOP
        RAISE INFO '%. sleeping...', v_id;
        PERFORM pg_sleep(10);
    END LOOP;
END;
$BODY$
LANGUAGE 'plpgsql';

While the above block is running, I open 2nd connection and run the following:

update foo set fname=md5(random()::text), lname=md5(random()::text) where id>170 and id<=180;
UPDATE 10

and then vacuum:

vacuum freeze verbose foo;

and it returns:

tuples: 0 removed, 999984 remain, 10 are dead but not yet removable

If I continue updating, the number of dead rows keeps growing. And it only gets removed when the anonymous block finishes. So I don’t understand why this simple block doesn’t let vacuum to remove dead rows?

2

Answers


  1. You can split up your lengthy process into smaller transactions, take an alternative technique to minimizing locking, or both to prevent blocking vacuum operations while performing large transactions. Here is a smaller transaction-based coding solution:

    DO $$ 
    DECLARE 
        v_id INT; 
    BEGIN 
        FOR v_id IN 1..10 
        LOOP 
            RAISE INFO '%. sleeping...', v_id; 
            -- Start a new transaction for each iteration
            BEGIN
                -- Perform your work here
                PERFORM pg_sleep(10);
            EXCEPTION
                -- Handle exceptions if needed
                WHEN others THEN
                    -- Log or handle the error
                    RAISE WARNING 'Error in iteration %: %', v_id, SQLERRM;
            END;
        END LOOP; 
    END; 
    $$ LANGUAGE plpgsql;
    

    For each loop iteration, a new transaction is started in this code. The locks are released as soon as each transaction is finished, containing the output of each iteration within its own transaction. This ought to enable the VACUUM procedure to work as intended and eliminate dead rows.

    I’ve also included an exception block to deal with any problems that may arise during the iterations. You can alter logic for handling errors to meet your requirements.

    Login or Signup to reply.
  2. When you run your UPDATE statement, it marks the rows as dead (i.e., they are no longer needed by any active transaction), but it doesn’t immediately remove them. Instead, PostgreSQL keeps these dead rows until the transaction that created them is completed. This is done to ensure transaction isolation and consistency.
    In your case, the DO block you’re running is an anonymous transaction block. It keeps an open transaction throughout its execution because you have not explicitly committed or rolled back the transaction within the block. This open transaction holds a snapshot of the database, including the rows that were marked as dead by the UPDATE statement.
    As long as this transaction is open, the dead rows cannot be removed by the VACUUM process because they are still visible to your transaction. Once the DO block completes (either successfully or due to an error), the transaction will be terminated, and PostgreSQL can safely remove the dead rows during the next VACUUM operation.You can either commit or rollback the transaction explicitly within your DO block as shown below.

    DO
    $BODY$
    DECLARE
        v_id INT;
    BEGIN
        for v_id in 1..10 
        LOOP
            RAISE INFO '%. sleeping...', v_id;
            PERFORM pg_sleep(10);
        END LOOP;
        
        -- Commit the transaction to release the locks and remove dead rows.
        COMMIT;
    END;
    $BODY$
    LANGUAGE 'plpgsql';
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search