skip to Main Content

I need each document in a collection to be updated only if its content is different, regardless of the order of the elements in nested lists.

Fundamentally, two versions should be the same if the elements are identical regardless of their order. MongoDB does not do that, by default.

def upsert(query, update):
    # collection is a pymongo.collection.Collection object
    result = collection.update_one(query, update, upsert=True)
    print("tFound match: ", result.matched_count > 0)
    print("tCreated: ", result.upserted_id is not None)
    print("tModified existing: ", result.modified_count > 0)

query = {"name": "Some name"}

update = {"$set": {
    "products": [
        {"product_name": "a"},
        {"product_name": "b"},
        {"product_name": "c"}]
}}
print("First update")
upsert(query, update)

print("Same update")
upsert(query, update)

update = {"$set": {
    "products": [
        {"product_name": "c"},
        {"product_name": "b"},
        {"product_name": "a"}]
}}
print("Update with different order of products")
upsert(query, update)

Output:

First update
Found match:  False
Created:  True
Modified existing:  False

Same update 
Found match:  True
Created:  False
Modified existing:  False

Update with different order of products
Found match:  True
Created:  False
Modified existing:  True

The last update does modify the document because the order of products are indeed different.

I did find a working solution which is to compare a sorting of the queried document’s content and a sorting of the new one.

Thanks to Zero Piraeus‘s response for the short and convenient way to sort for comparison.

def ordered(obj):
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    else:
        return obj

I apply it to compare the current and the new versions of the document. If their sorting are different, I apply the update.

new_update = {
    "products": [
        {"product_name": "b"},
        {"product_name": "c"},
        {"product_name": "a"}]
}

returned_doc = collection.find_one(query)
# Merging remote document with local dictionary
merged_doc = {**returned_doc, **new_update}
if ordered(returned_doc) != ordered(merged_doc):
    upsert(query, {"$set": new_update})
    print("Updated")
else:
    print("Not Updated")

Output:

Not Updated

That works, but that relies on python to do the comparison, introducing a delay between the read and the write.

Is there a way to do it atomically ? Or, even better, a way to set a MongoDB Collection to adopt some kind of "order inside arrays doesn’t matter" mode ?

This is part of a generic implementation. Documents can have any kind of nesting in their structure.

2

Answers


  1. EDIT:
    According to your comment (and the resemblance to other question) I suggest:

    db.collection.updateMany(
      {"name": "Some name"},
      [{
        $set: {products: {
            $cond: [
              {$setEquals: [["a", "c", "b"], "$products.product_name"]},
              "$products",
              [{"product_name": "a"}, {"product_name": "c"}, {"product_name": "b"}]
            ]
        }}
      }]
    )
    

    See how it works on the playground example

    Original answer:

    One option is to use the query part of the update to work only on documents that are matching your condition:

    db.collection.update(
      {"name": "Some name",
       "products.product_name": {
        $not: {$all: ["a", "b", "c"]}}
      },
      {"$set": {
        "products": [
          {"product_name": "b"},
          {"product_name": "c"},
          {"product_name": "a"}
        ]
      }}
    )
    

    See how it works on the playground example

    Login or Signup to reply.
  2. @nimrodserok correctly pointed out a flaw in my first answer. Here’s my updated answer that’s a slight variation of his answer.

    This should also preserve the upsert option.

    new_new_update = [
      {
        "$set": {
          "products": {
            "$let": {
              "vars": {
                "new_products": [
                  {"product_name": "b"},
                  {"product_name": "c"},
                  {"product_name": "a"}
                ],
                "current_products": {
                  # need this for upsert
                  "$ifNull": ["$products", []]
                }
              },
              "in": {
                "$cond": [
                  {"$setEquals": ["$$current_products", "$$new_products"]},
                  "$$current_products",
                  "$$new_products"
                ]
              }
            }
          }
        }
      }
    ]
    

    Here’s a mongoplayground.net example to demonstrate the concept. You can change the "name" value to verify the upsert option.

    I’m curious what the result values are for this update_one.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search