skip to Main Content

MongoDB’s db.collection.find().limit(10000) does NOT return the first 10000 documents I inserted. How would I make MongoDB return documents in the order they were inserted?

2

Answers


  1. You can sort items by _id field.

    id field is auto-increment and you can sort item in asc or desc to get last or first of you collection.

    // sort desc
    db.find({},{_id:-1}).limit(10000) 
    // sort asc
    db.find({},{_id:1}).limit(10000) 
    
    Login or Signup to reply.
  2. Mongodb sorts document based on a natual order. However, as you have found out, it is not 100% reliable. Documents grow and move internally.

    Since you have 10s of millions of rows you face the problem of having to use .find() which will select all documents, then sort them, then limit to 10000. Not great for performance, I agree.

    Mongodb has a $natural operator that you can use to sort results in their natural order (on disk) but best practice is to create a view and use it that way.

    You could try this and check for performance:

    db.collection.find({_id: {$ne : null}}).sort({ $natural: 1 }).limit(10000);
    

    Note the use of {_id: {$ne : null}}. This forces $natural to use indexes to fulfil the query (speed things up).

    Queries that include a sort by $natural order do not use indexes to fulfill the query predicate with the following exception: If the query predicate is an equality condition on the _id field { _id: }, then the query with the sort by $natural order can use the _id index.

    Disclaimer:

    The $natural parameter returns items according to their natural order within the database. This ordering is an internal implementation feature, and you should not rely on any particular ordering of the documents.

    However, I would think sorting by _id would result in better insertion ordering. That’s becuase if your _id field is an ObjectId then:

    ObjectId is a 12-byte BSON type, constructed using:
    a 4-byte value representing the seconds since the Unix epoch,
    a 3-byte machine identifier,
    a 2-byte process id, and
    a 3-byte counter, starting with a random value.

    So the 4-byte value representing the seconds since the Unix epoch will always be incremental. Try this way:

    db.collection.find().sort({_id:1}).limit(10000);
    

    You could then do this to get the next 10000:

    db.collection.find().sort({_id:1}).limit(20000).skip(10000);
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search