Members, Hope you and yours are keeping safe and well.
Am an end user not a coder as in we use software written on Postgres and just hoping for some information comment about a issue we have.
We have a question we are not sure the vendor is being clear about,
It is a field service management software so we onboard Customers, Contacts, Sites, Suppliers, Contractors and Employees.
The Customers, Contacts, Sites and Suppliers each get consecutive ID numbers when we add a new one i.e. Customer 1, 2, 3 etc and SUpplier 1, 2 3 etc.
But the The Employees and Contractors share a stack but the issue is they may go 1, 2 ,3 30, 76, 77, 78, 90 as in there are non uniform gaps between. We are up to ID 570 for only 110 records!
When quizzed as to why our employee ID’s are not consecutive (important to us because of payroll and other automatons separately) the vendor says;
"I have consulted with our Development Team on your case and have been informed that the reason Employee IDs are not all consecutive is because this s a safety feature built into Postgress.
We use database replication which means that the data stored in the databases is replicated across a lot of servers (not just one). This is done for redundancy and performance.
If a sequence is used (which is the case with Employee IDs) then what happens is the application asks the database for the next number in the sequence that it can use. But because there are a lot of servers involved it has to ensure that the number chosen is unique and that a different server is not also allocating the same number to a different transaction.
So it isn’t getting the next number off the list. What it does is adds the next number that the server believes is available and then decrements that until it gets a number that it can use in agreement with the other DB servers (this is to ensure uniqueness).
When there is a crash and one (or more) of the DB servers goes offline or has to recover then it creates a state in which the other DB servers can’t know if that server allocated/committed a number in the sequence or not (since replication is not possible).
So to ensure that there is no possible clash between the sequence numbers it has to start from the highest number written to the WAL (hence where the number skip comes from). This also aids in performance to ensure that recovery is quick and less prone to errors (and thus prevents data loss)."
Does not seem correct to us since other lists Customers, Suppliers etc are all consecutive.
Would just appreciate any informed comment as to whether their explanation is credible or waffle!
Thanks
We have only reviewed record counts, periods between, other lists withing software to see any patterns as would explain gaps in ID records. this is mainly just wanting comment,.
Done know what version Postgres
2
Answers
First of all, an identifier is just an identifier and nothing else. As an identifier nr. 10 is just as different to nr. 5 as it is to nr. 500. These are just 3 different pointers to 3 different records. It doesn’t tell you anything about the number of records in the database, that’s a completely different question. For example nr. 500, would that mean that you have 500 records? Or could it be that you deleted 497 records and now have just 3 records left? The number 500 can’t tell you the difference.
If you have multiple servers and they all create numbered identifiers themselves, you will have a conflict one day. The only safe solution for this, is to use UUID’s.
But no matter what, identifiers never have gaps. For an identifier, there is no gap between 5 and 10, nor 10 and 500. It just doesn’t exist.
If you have a requirement to have consecutive ID without gaps for your entities it must be requested as such by your development. There is no tecnical feature in PostgreSQL that would restrain it, but there are possible performance implications.
The
ID
‘s are typically created with sequences that guarantie uniqueness and high performance (you may create safely large number of customers in parallel), but may produce gaps.Why? A simplest example is if you fail to create a customer, you rollback the transaction and the
ID
gets lost.For a consecutive ID’s the application can not use sequences, must serialize the creation of the customers and on failure return the free key back. The serialization may have performance implications.
So basically there is no PostgreSQL feature that will prevent consecutive ID’s, but if you use sequences you can’t get it.