transaction id data type’s possible values= 2^32= ~ 4 billion, then why does postgres declares a transaction wraparound and shuts down database at 2~31 2 billion transactions.
Question posted in PostgreSQL
The official documentation can be found here.
The official documentation can be found here.
3
Answers
That’s because PostgreSQL does not implement unsigned integers since it’s not in the SQL standard.
That’s because of what transaction IDs are considered future and which ones are in the past. If you view the space of transaction IDs as a circle, so that if you overflow at 232, you start at 0 again, the separation between past and future is diametrically across from your current transaction ID. So if your transaction ID crosses 231, low transaction IDs change from being in the distant past to being in the remote future. As a consequence, old, visible rows might suddenly become invisible, and long deleted rows that have never been vacuumed away could "rise from the dead". The reason for all that is that the visibility of a PostgreSQL row is governed by the system columns
xmin
andxmax
, which are transaction IDs.Let me add an example: if your current transaction ID is 42, then the future would consist of all transaction IDs between 42 and 231+42, while the past are all transaction IDs between 231+42 and 232, plus all transaction IDs between 3 and 42 (the transaction IDs 0, 1 and 2 have a special meaning).
Postgres does not necessarily shut down the database after 2^31 transactions. It only does so if
VACUUM
is not run in between. If it runs regularly, you can execute as many transactions as you like (i.e. trillions) and the transaction-ids will happily wrap around without any problems.Therefore, for each transaction-id there must be both, ids that compare lower, and ids that compare higher. To achieve that, Postgres uses modulo 2^32 arithmetic.
What
VACUUM
does is it marks rows from "old" transactions as "frozen", meaning they compare as older than all currently running transactions regardless of their actual id. This way the frozen rows are visible to all currently running transactions, even if simply comparing their transaction ids would suggest otherwise.