skip to Main Content

I am using Mongo to store multi tenant data. As part of data cleanup for a tenant I want to delete everything related to the tenant. The tenantId is indexed but there are alot of rows and it takes a long time to query and I have no easy way to get the progress.

Currently I do something

db.records.deleteMany({tenantId: x})

Is there a better way?

Thinking of doing in batches but like query for x records then build a list of ids to delete. Seems very manual but isit the recommended way?

5

Answers


  1. Some options that I can think of.

    1. Drop the index, before deleting. You can recreate the index after the deletion.

    2. Change the write concern to a lower value, possibly 0. Request won’t wait for acknowledgement from secondaries.

      db.records.deleteMany({tenantId: x},{w : 0});

    3. If there is another field with enough cardinality to reduce the number of documents, try including that in the query.

      Ex: if anotherField as 0,1,2,3 as values, then execute the delete command 4 times, each time with different value.

      db.records.deleteMany({tenantId: x, anotherField: 0},{w : 0});
      db.records.deleteMany({tenantId: x, anotherField: 1},{w : 0});
      db.records.deleteMany({tenantId: x, anotherField: 2},{w : 0});
      db.records.deleteMany({tenantId: x, anotherField: 3},{w : 0});
      
    Login or Signup to reply.
  2. I would suggest two solutions, and also please export your model If anything goes wrong you will have a backup of your data or try this in your test DB first 

    1. you can use your tenantId as a condition, not matching _id but with extra logic, like if any of the records do have the tenantId delete them so this way all of your tenant data will be removed using a single query.

      db.records.deleteMany({tenantId : {$exists: true})
      // suggestion- if any of your tenant data has a field tenantId but it is null you can check for a null value also to delete those records.
      

     
    2) find command data in all of the records, if there is use it as a condition to delete those records.
    for example, all of your tenant data have a common field called type with the same value use delete statement like

       db.records.deleteMany({type : 1})
    
    Login or Signup to reply.
  3. DeleteMany I think, There must be something common between all the rows that you want to remove from the collection.

    You can find out something and then create a query accordingly.

    this will help you to remove those records fast.

    Let me give you one example. I want to remove all the records where username is not exists.

    db.collection.deleteMany({ username: {$exists: false} })
    
    Login or Signup to reply.
  4. The performance may depend on variety of different factors. But here are some options you can try to improve the performance

    Bulk operations

    Bulk operations might help here. bulk.find(query).remove() is a version of db.collection.remove(query) that optimized for large numbers of operations. You can read more about it here

    You can use the following way:

    Declare a search query:

    var query= {tenantId: x};
    

    Initialize and use a bulk:

    var bulk = db.yourCollection.initializeUnorderedBulkOp()
    bulk.find(query).remove() // or try delete() instead of remove()
    bulk.execute()
    

    The idea here rather not to speed up the removal, but to produce less load.


    Also you could try bulkWrite()

    db.yourCollection.bulkWrite([
       { deleteMany: {
          "filter" : query,
       }}
    ])
    

    TTL indexes

    It may be not suitable for your use case, but there’s entirely another approach without removing by yourself at all.

    If it is suitable for you to delete data based on a timestamp, then a TTL index might help you. The idea here is that the record is being removed when the TTL expires.

    Implemented as a special index type, TTL collections make it possible
    to store data in MongoDB and have the mongod automatically remove data
    after a specified period of time.

    Login or Signup to reply.
  5. The best place to start is to find something that all records have in common in-order to removed them all at once.
    For example the following code deletes all entries that don’t contain an email address.

    db.users.deleteMany({ email: { $exists: false } })
    

    MongoDB documentation have great examples. Link provided below.
    https://www.mongodb.com/docs/manual/reference/method/db.collection.deleteMany/#delete-multiple-documents

    You might also want to consider dropping the index since it could be recreated after your done with the operation.

    Finally you might want to lower the write concern in your operation in order to speed things up. A compile list of options can be found here
    https://www.mongodb.com/docs/v5.0/reference/write-concern/#w-option

    I found a good tutorial on https://www.geeksforgeeks.org/mongodb-delete-multiple-documents-using-mongoshell/ that might help you further.

    apologies for any grammatical mistakes since English is not my native tongue

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search