skip to Main Content

I have a financial system where users have tokens and can add transactions. The system has to calculate the balance and mean acquisition price of each token. Data integrity is of utmost importance in the system and it should be impossible to have incorrect balances or mean prices in the system.

To comply with these requirements I’ve come up with the following tables:

  • token (to hold each token)
  • transaction (to hold each transaction of a token)
  • balance (to hold the token balances without having to calculate each time using all transactions)

The token and transaction tables are straight forward. The balance table is a table that is automatically updated using a PostgreSQL trigger to hold each change of balance in a token. This table exists so every time we need to know something like "What was the balance/mean price of token A in 2023-01-05?" we don’t need to sum all transactions and calculate from scratch.

Trigger

Enough of explanation, this is the trigger I’ve come up with. It fires AFTER every INSERT in the transaction table.

DECLARE
    old_balance NUMERIC(17, 8);
    old_mean_price NUMERIC(17, 8);
    old_local_mean_price NUMERIC(17, 8);
    new_balance NUMERIC(17, 8);
    new_mean_price NUMERIC(17, 8);
    new_local_mean_price NUMERIC(17, 8);
BEGIN
    -- Prevent the creation of retroactive transaction since it would mess up the balance table
    IF EXISTS (
        SELECT * FROM transaction
        WHERE
          token_id = NEW.token_id
          AND date > NEW.date
      ) THEN
      RAISE EXCEPTION 'There is already a newer transaction for token %', NEW.token_id;
    END IF;

    -- Fetch the latest balance of this token
    SELECT
      amount,
      mean_price,
      local_mean_price
    INTO
      old_balance, old_mean_price, old_local_mean_price
    FROM balance
    WHERE
      token_id = NEW.token_id
      AND date <= NEW.date
    ORDER BY date DESC
    LIMIT 1;

    -- If there's no balance in the table then set everything to zero
    old_balance := COALESCE(old_balance, 0);
    old_mean_price := COALESCE(old_mean_price, 0);
    old_local_mean_price := COALESCE(old_local_mean_price, 0);

    -- Calculate the new values
    IF NEW.side = 'buy' THEN
      new_balance := old_balance + NEW.quantity;
      new_mean_price := (old_balance * old_mean_price + NEW.quantity * NEW.unit_price) / new_balance;
      new_local_mean_price := (old_balance * old_local_mean_price + NEW.quantity * NEW.local_unit_price) / new_balance;
    ELSIF NEW.side = 'sell' THEN
      new_balance := old_balance - NEW.quantity;
      new_mean_price := old_mean_price;
      new_local_mean_price := old_local_mean_price;
    ELSE
      RAISE EXCEPTION 'Side is invalid %', NEW.side;
    END IF;

    -- Update the balance table
    IF NOT EXISTS (
        SELECT * FROM balance
        WHERE
          date = NEW.date
          AND token_id = NEW.token_id
        ) THEN
      -- Create a row in the balance table
      INSERT INTO balance
         (date, token_id, amount, mean_price, local_mean_price)
      VALUES
        (
          NEW.date,
          NEW.token_id,
          new_balance,
          new_mean_price,
          new_local_mean_price
        );
    ELSE
      -- There's already a row for this token and date in the balance table. We should update it.
      UPDATE balance
      SET
         amount = new_balance,
         mean_price = new_mean_price,
         local_mean_price = new_local_mean_price
      WHERE
         date = NEW.date
         AND token_id = NEW.token_id;
    END IF;

    RETURN NULL;
END;

This trigger does some things:

  1. Prevents the insertion of retroactive transactions, since this means we would have to update all the following balances
  2. Add a new row in the balance table with the updated balance and mean prices of the token
  3. Or, update the row in the balance if one already exists with the same datetime

Race condition

This works fine, but it has a race condition when executing 2 concurrent transactions. Imagine the following scenario:

  1. Start T1 using BEGIN
  2. Start T2 using BEGIN
  3. T1 inserts a row in the transaction table
  4. The trigger is fired inside T1 and it inserts a row in balance
  5. T2 inserts a row in the transaction table
  6. The trigger is fired inside T2 but it cannot see the changes made from the T1 trigger since it has not commited yet
  7. The balance created by T2 is incorrect because it used stale data

Imperfect solution 1

Maybe I could change the SELECT statement in the trigger (the one that selects the previous balance) to use a SELECT FOR UPDATE. This way the trigger is locked until a concurrent trigger is commited. This doesn’t work because of three things:

  1. If it’s the first transaction then the table balance doesn’t have a row for that particular token (this could be solved by locking the token table)
  2. Even if we lock and wait the concurrent transaction to commit, because of the way transaction works in PostgreSQL we would still fetch stale data since inside a transaction we only have visibility of the data that was there when the transaction started.
  3. Even if we managed to get the most up to date information, there’s still the issue that T1 can rollback and this means that the balance generated in T2 would still be incorrect

Imperfect solution 2

Another solution would be to scrap the FOR UPDATE and just defer the trigger execution to the transaction commit. This solves the race condition since the trigger is executed after the end of the transaction and has visibility of the most recent changed. The only issue is that it leaves me unable to use the balance table inside the transaction (since it will only be updated after the transaction commits)

Question

I have two questions regarding this:

  1. Does the Imperfect solution 2 really solves all the race condition problems or am I missing something?
  2. Is there a way to solve this problem and also update the balance table ASAP?

2

Answers


  1. Your solution 2 only narrows the race condition, but does not fix it. Both transactions could commit at the same time.

    There are only two ways to prevent such a race condition:

    • use the SERIALIZABLE transaction isolation level (you can set that as default value with the parameter default_transaction_isolation)

    • lock whatever is necessary to prevent concurrent operations (for example, the corresponding row in balance)

    Besides, your code can be improved: You should check for the existence of a balance only once, and you could use INSERT ... ON CONFLICT.

    You could read my article for a more detailed analysis.

    Login or Signup to reply.
  2. You could also consider either creating an extra table containing the running transactions and trowing an error if an insert into that is not possible or simply locking the relevant balance rows and forcing the transactions to run fully sequentially that way. Either way you force conflicting statements to run one at a time thus resolving the race condition.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search