
We are seeing a lot of trouble in our Django application in a place where we use get_or_create.

model, created = SomeModel.objects.get_or_create(user=user, name=name, defaults={...})
if created:
    created_dimensions = get_some_dimensions()  # SLOW: calls a third party
    model.dimension_values.set(created_dimensions)

# Code here that depends on dimensions being there for the model

The problem is that the code above can run twice at the same time.

  1. The first run creates the model and starts "get_some_dimensions", but doesn’t finish it (so the dimensions are not saved yet).
  2. Then the second run starts and sees that the model is already there.
  3. The second run, however, reaches the "code here that depends on dimensions" part before the first run saves.

At step 3 the database ends up in an erroneous state.

How can I prevent the second run from proceeding (for example, by locking the database) until the full object is built?

2 Answers


  1. Wrap it in a transaction?

    from django.db import transaction

    with transaction.atomic():
        model, created = SomeModel.objects.get_or_create(user=user, name=name, defaults={...})
        if created:
            created_dimensions = get_some_dimensions()  # SLOW: calls a third party
            model.dimension_values.set(created_dimensions)
            # .set() writes the relation immediately; no model.save() is needed

    # the new object only becomes visible to other transactions here, once the with block exits and the transaction commits
    

    But I’m not a DB expert, so I’m not sure what happens if another get_or_create runs while the third party is being slow. Does the database stall the second operation until the transaction finishes? And would the user see that as a bug if the third-party request takes tens of seconds?

    Another way would be to test whether SomeModel.objects.filter(...).exists() and, if not, request the information from the slow third party before calling get_or_create (supplying the third-party information to that call so the object is created in a complete state). The only drawback is a possible duplicate request to the third party for information that turns out not to be needed. (Does it cost big money per call?)
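
    A rough sketch of that second approach, in case it helps. The helper name get_or_build_complete and its parameters are made up for illustration. It assumes dimension_values is a many-to-many relation (which cannot be passed through defaults, hence the short transaction), that there is a unique constraint on the lookup fields, and that the database runs at READ COMMITTED or stricter (the PostgreSQL default).

    ```python
    from django.db import transaction


    def get_or_build_complete(model_cls, lookup, fetch_dimensions):
        """Return the row matching `lookup`, creating it with dimensions attached.

        Hypothetical helper: `model_cls` is assumed to have a `dimension_values`
        relation and a unique constraint covering the `lookup` fields.
        """
        existing = model_cls.objects.filter(**lookup).first()
        if existing is not None:
            return existing

        # The slow third-party call runs before any transaction is opened,
        # so no database lock is held while waiting on it.
        dimensions = fetch_dimensions()

        with transaction.atomic():
            obj, created = model_cls.objects.get_or_create(**lookup)
            if created:
                obj.dimension_values.set(dimensions)
        # Under the isolation level assumed above, other connections only see
        # the new row after the commit, when its dimensions are already set.
        return obj
    ```

    It would be called as get_or_build_complete(SomeModel, {"user": user, "name": name}, get_some_dimensions). If two runs race, both may call the third party, but only one insert should succeed; the other get_or_create is expected to fall back to fetching the committed row.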

  2. Try using select_for_update() inside transaction.atomic(). It locks the selected rows until the transaction is committed, so any other code that tries to select the same rows with select_for_update() will wait until that lock is released. Example:

    from django.db import transaction

    with transaction.atomic():
        model, created = SomeModel.objects.select_for_update().get_or_create(user=user, name=name, defaults={...})
        # At this point, the selected row is locked.
        if created:
            created_dimensions = get_some_dimensions()  # SLOW: calls a third party
            model.dimension_values.set(created_dimensions)
        # Once this block completes and commits, the lock is released.
    

    You can also customize how the waiting code behaves with the skip_locked or nowait parameters of select_for_update().
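
    For instance, a caller that should give up instead of waiting might look roughly like the sketch below. The helper process_if_unlocked and its process callback are hypothetical names, not part of Django. It assumes a backend that supports row locks, such as PostgreSQL; on SQLite, select_for_update() has no effect.

    ```python
    from django.db import DatabaseError, transaction


    def process_if_unlocked(model_cls, lookup, process):
        """Run `process(obj)` under a row lock, or return None if the row is busy.

        Hypothetical helper; `process` stands for the code that depends on
        the dimensions being present.
        """
        try:
            with transaction.atomic():
                # nowait=True raises DatabaseError right away instead of
                # blocking when another transaction holds the row lock.
                qs = model_cls.objects.select_for_update(nowait=True)
                obj = qs.get(**lookup)
                return process(obj)
        except DatabaseError:
            # Note: this also swallows a DatabaseError raised by `process`.
            return None
    ```

    skip_locked=True works differently: locked rows are left out of the result instead of raising, which fits queue-style queries better than a single get(). The two options cannot be combined.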
