I have below aggregate mongo query & want to use mongo count instead of sum to check the performance.
collection sample :
[
{
"rate": 500,
"orderDate": 20230101,
"username": "Brad"
},
{
"rate": 20,
"orderDate": 20230101,
"username": "Lewis"
},
{
"rate": 200,
"orderDate": 20230101,
"username": "Brad"
}
]
db.collection.aggregate([
{
$match: {
orderDate: 20230101
}
},
{
$addFields: {
validOrder: 1
}
},
{
$group: {
_id: "$username",
approvedOrder: {
$sum: "$validOrder"
}
}
}
])
I am getting expected output as :
[
{
"_id": "Lewis",
"approvedOrder": 1
},
{
"_id": "Brad",
"approvedOrder": 2
}
]
In the above aggregate how cam I replace $sum with mongo count ? Please note my actual condition inside addfields is different, so in above query I only wanted to change the $sum to count.
2
Answers
Replace
with
Demo @ Mongo Playground
Note that
$count
aggregation operator is available from MongoDB version 5.0.Reference:
$count
(aggregation operator) SyntaxThe answer from @Yong Shun addresses the actual question that was asked, namely how to use
$count
instead of$sum
when writing the aggregation. This answer drills down on the "to check the performance" component of the question.As mentioned in the comments,
$count
effectively provides a more concise syntax for users to express their query. Indeed, check out thestages
that are reported in theexplain
output from the two commands:They are effectively the exact same thing (since
validOrder
has a generated value of1
in the original pipeline. So there will be no performance difference there.Without knowing more about the current performance, we can’t say anything with certainty about how you might be able to improve it. Broadly I have two general recommendations:
$addFields
stage (either by using$sum: 1
or using$count
which generates the same thing). It is not providing any particular value so there’s no reason to make the pipeline larger than it has to be.{ orderDate: 1, username: 1 }
.Taken together, these two things will allow the database to perform this operation as a covered query meaning that it can satisfy it by reading the index alone without fetching documents. This will represent the most efficient way you can perform this operation given the current schema.