skip to Main Content

I’ve a mongo collection having around 550,000 documents. Document has an array field path on which i have a below query in my java code. This field is indexed.

Problem is ids in below query can go up to 6000 causing the query to take ~8 secs. Tried to use aggregator to bring it down but no luck.

Could some one please guide here what else can be done.

Query query = new Query(Criteria.where("ancestors").is(null).and("path").in(ids));
var data = mongoTemplate.findDistinct(query, "path", Orders.class, String.class);

2

Answers


  1. When you use $in operator for find queries, after ~200 units, it converts the query to individual IO queries.
    You should rather convert the usecase to range operator if they are in seq.

    Ex:

    ids.sort()
    { path: {gte: ids[0], lte: ids[n]} }
    

    you can get a cursor with this filter and using next you can fetch records one by one.
    this will not overload your system IO resources, you will get the data even faster without context switch as well.

    If your ids are not in seq, then you can break that large req in multiple req of 100 batches and call them sequentially/parallely as per your usecase.

    Login or Signup to reply.
  2. I hope this approach can help you solve your issue

    Here’s an example of how you can use the aggregation framework to optimize your query:

    Aggregation aggregation = Aggregation.newAggregation(
        Aggregation.match(Criteria.where("ancestors").is(null).and("path").in(ids)),
        Aggregation.group("path").first("path").as("path")
    );
    
    AggregationResults<String> results = mongoTemplate.aggregate(aggregation, "yourCollectionName", String.class);
    List<String> data = results.getMappedResults();
    

    Let me know if it works.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search