We are seeing a lot of trouble in our Django application where we use get_or_create:
model, created = SomeModel.objects.get_or_create(user=user, name=name, defaults={...})
if created:
    created_dimensions = get_some_dimensions()  # SLOW, as we call a third party
    model.dimension_values.set(created_dimensions)
# Code here that depends on dimensions being there for the model
The problem is that the code above runs twice at the same time:
- The first run creates the model and starts "get_some_dimensions", but has not finished it yet (so the dimensions are not saved).
- The second run then starts and sees that the model is already there.
- The second run therefore goes straight to the "code here that depends on dimensions" before the first run has saved them.
At the third step the database ends up in an erroneous state.
How can I prevent the second run from proceeding, for example by locking the database, until the full object is built?
2 Answers
Wrap it in a transaction?
But I’m not a DB expert, so I’m not sure what happens if another get_or_create happens while the third party is being slow. Does the database stall the second operation until the transaction terminates? Might the user regard that as a bug if the third-party request takes tens of seconds?
Another way would be to test if SomeModel.objects.filter(...).exists() and, if not, request the information from the slow third party before doing a get_or_create (supplying the third-party information to that call so the object is created in a complete state). The only drawback here is a possible duplicate request to the third party for information that’s not needed. (Does it cost big money per call?)
Try using select_for_update with transaction.atomic. It will lock the rows until the transaction is committed, i.e. if other code tries to get the locked model, it will wait until that lock is released. For example, the lookup could become SomeModel.objects.select_for_update().get_or_create(...) inside a "with transaction.atomic():" block.
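You can also customize how the waiting code behaves with the nowait or skip_locked parameters of select_for_update: nowait=True raises an error instead of waiting, and skip_locked=True skips the locked rows.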
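To illustrate why holding a lock until the object is fully built closes the race, here is a framework-free simulation rather than Django code. It is only a sketch under stated assumptions: a threading.Lock stands in for the row lock held by the transaction, a dict stands in for the SomeModel table, and get_some_dimensions, get_or_build and run_twice are hypothetical helpers written for this example.

```python
import threading
import time

table = {}                    # stands in for the SomeModel table
row_lock = threading.Lock()   # stands in for the lock held by the transaction
third_party_calls = []        # one entry per slow lookup, for inspection


def get_some_dimensions():
    """Hypothetical stand-in for the slow third-party call."""
    third_party_calls.append(1)
    time.sleep(0.05)
    return ["width", "height"]


def get_or_build(key):
    # Everything under the lock plays the role of the atomic block:
    # no other caller can look at the row until it is complete.
    with row_lock:
        created = key not in table
        if created:
            table[key] = {"dimensions": None}  # row exists but is incomplete
            table[key]["dimensions"] = get_some_dimensions()
        # Code that depends on the dimensions runs here, still under the lock.
        return created, list(table[key]["dimensions"])


def run_twice(key):
    """Start two concurrent callers for the same key and collect their results."""
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(get_or_build(key)))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_twice(("user-1", "report")) lets one caller create the row while the other waits and then reads the finished dimensions, so the slow lookup happens only once. The cost is the one raised in the first answer: the second caller is stalled for as long as the third-party call takes. Note also that this simulation uses a single global lock, whereas a database would lock only the affected rows.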