skip to Main Content

I have below aggregate mongo query & want to use mongo count instead of sum to check the performance.

collection sample :

[
  {
    "rate": 500,
    "orderDate": 20230101,
    "username": "Brad"
  },
  {
    "rate": 20,
    "orderDate": 20230101,
    "username": "Lewis"
  },
  {
    "rate": 200,
    "orderDate": 20230101,
    "username": "Brad"
  }
]

db.collection.aggregate([
  {
    $match: {
      orderDate: 20230101
    }
  },
  {
    $addFields: {
      validOrder: 1
    }
  },
  {
    $group: {
      _id: "$username",
      approvedOrder: {
        $sum: "$validOrder"
      }
    }
  }
])

I am getting expected output as :

[
  {
    "_id": "Lewis",
    "approvedOrder": 1
  },
  {
    "_id": "Brad",
    "approvedOrder": 2
  }
]

In the above aggregate how cam I replace $sum with mongo count ? Please note my actual condition inside addfields is different, so in above query I only wanted to change the $sum to count.

https://mongoplayground.net/p/Xe_7veGpEvD

2

Answers


  1. Replace

    $sum: "$validOrder"
    

    with

    $count: {}
    

    Demo @ Mongo Playground

    Note that $count aggregation operator is available from MongoDB version 5.0.

    Reference: $count (aggregation operator) Syntax

    Login or Signup to reply.
  2. The answer from @Yong Shun addresses the actual question that was asked, namely how to use $count instead of $sum when writing the aggregation. This answer drills down on the "to check the performance" component of the question.

    As mentioned in the comments, $count effectively provides a more concise syntax for users to express their query. Indeed, check out the stages that are reported in the explain output from the two commands:

    > db.collection.explain().aggregate([ { $match: { orderDate: 20230101 } }, {$addFields:{validOrder:1}},{ $group: { _id: "$username", approvedOrder: { $count: {} } } }]).stages
    [
      {
        '$cursor': {
          ...
        }
      },
      { '$addFields': { validOrder: { '$const': 1 } } },
      {
        '$group': { _id: '$username', approvedOrder: { '$sum': { '$const': 1 } } }
      }
    ]
    test> db.collection.explain().aggregate([ { $match: { orderDate: 20230101 } }, { $addFields: { validOrder: 1 } }, { $group: { _id: "$username", approvedOrder: { $sum: "$validOrder" } } }]).stages
    [
      {
        '$cursor': {
          ...
        }
      },
      { '$addFields': { validOrder: { '$const': 1 } } },
      {
        '$group': { _id: '$username', approvedOrder: { '$sum': '$validOrder' } }
      }
    ]
    

    They are effectively the exact same thing (since validOrder has a generated value of 1 in the original pipeline. So there will be no performance difference there.

    Without knowing more about the current performance, we can’t say anything with certainty about how you might be able to improve it. Broadly I have two general recommendations:

    1. Get rid of the $addFields stage (either by using $sum: 1 or using $count which generates the same thing). It is not providing any particular value so there’s no reason to make the pipeline larger than it has to be.
    2. Make sure you have a compound index on { orderDate: 1, username: 1 }.

    Taken together, these two things will allow the database to perform this operation as a covered query meaning that it can satisfy it by reading the index alone without fetching documents. This will represent the most efficient way you can perform this operation given the current schema.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search