I’ve been trying to find a way to limit the number of objects i’m pushing to arrays I’m creating while using "aggregate" on a MongoDB collection.
I have a collection of students – each has these relevant keys:
class number it takes this semester (only one value),
percentile in class (exists if is enrolled in class, null if not),
current score in class (> 0 if enrolled in class, else – 0),
total average (GPA),
max grade
I need to group all students who never failed, per class, in one array that contains those with a GPA higher than 80, and another array containing those without this GPA, sorted by their score in this specific class.
This is my query:
db.getCollection("students").aggregate([
{"$match": {
"class_number":
{"$in": [49, 50, 16]},
"grades.curr_class.percentile":
{"$exists": true},
"grades.min": {"$gte": 80},
}},
{"$sort": {"grades.curr_class.score": -1}},
{"$group": {"_id": "$class_number",
"studentsWithHighGPA":
{"$push":
{"$cond": [{"$gte": ["$grades.gpa", 80]},
{"id": "$_id"},
"$$REMOVE"]
}
},
"studentsWithoutHighGPA":
{"$push":
{"$cond": [{"$lt": ["$grades.gpa", 80]},
{"id": "$_id"},
"$$REMOVE"]
},
},
},
},
])
What i’m trying to do is limit the number of students in each of these arrays. I only want the top 16 in each array, but i’m not sure how to approach this.
Thanks in advance!
I’ve tried using limit in different variations, and slice too, but none seem to work.
3
Answers
I don’t think there is a mongodb-provided operator to apply a limit inside of a $group stage.
You could use $accumulator, but that requires server-side scripting to be enabled, and may have performance impact.
Limiting studentsWithHighGPA to 16 throughout the grouping might look something like:
Note that the score is also carried through until the very end so that partial result sets from different shards can be combined properly.
Maybe MongoDB
$facets
are a solution. You can specify different output pipelines in one aggregation call.Something like this:
This should end up with one document including two arrays.
See each facet as an aggregation pipeline on its own. So you can also include
$group
to group by classes or something else.https://www.mongodb.com/docs/manual/reference/operator/aggregation/facet/
Since mongoDb version 5.0, one option is to use
$setWindowFields
for this, and in particular, its$rank
option. This will allow to keep only the relevant students and limit their count even before the$group
step:$match
only relevant students as suggested by the OP$set
thegroupId
for thesetWindowFields
(as it can currently partition by one key only$setWindowFields
to define the rank of each student in their array$match
only students with the wanted rank$group
byclass_number
as suggested by the OP:See how it works on the playground example
*This solution will limit the rank of the students, so there is an edge case of more than n students in the array (In case there are multiple students with the exact rank of n). it can be simply solved by adding a
$slice
step