I have a collection of documents that each contain arrays. the output i want is the unwinded array elements filtered down by a certain criteria.
i can do this by doing:
db.collection.aggregate([
{
$unwind: "$myArrayField"
},
{
$match: {
"myArrayField.myCriteriaField": "myValue"
}
}
])
but it seems wasteful to unwind everything, so instead i do
db.collection.aggregate([
{
$match: {
"myArrayField.myCriteriaField": "myValue"
}
},
{
$unwind: "$myArrayField"
},
{
$match: {
"myArrayField.myCriteriaField": "myValue"
}
}
])
to filter down to viable document candidates, unwind those, and then filter down the unwound elements again since there will be elements that don’t match.
alternatively, i can do
db.collection.aggregate([
{
$project: {
myArrayField: {
$filter: {
input: "$myArrayField",
as: "element",
cond: { $eq: ["$$element.myCriteriaField", "myValue"] }
}
}
}
},
{
$unwind: "$myArrayField"
}
])
or even
db.collection.aggregate([
{
$match: {
"myArrayField.myCriteriaField": "myValue"
}
},
{
$project: {
myArrayField: {
$filter: {
input: "$myArrayField",
as: "element",
cond: { $eq: ["$$element.myCriteriaField", "myValue"] }
}
}
}
},
{
$unwind: "$myArrayField"
}
])
which one is the most efficient? i don’t understand enough about how the aggregations are run to know what variables it could depend on (collection size, document size, etc).
lastly, the match unwind match seems pretty straightforward but it also just feels wrong to have to do it like that so i’m wondering if i’m missing something.
2
Answers
First of all you should have an index on the field.
Then you should limit the result as early as possible meaning the
$match
stage should come first.The rest is more a matter of taste.
First, You must sure have an index on the field. Using
Explain
to check index hit with$match
first stage.Now, we explain your 2 suggestion:
$project
$filter
$unwind
vs$match
$project
$filter
$unwind
:$project
$filter
$unwind
: with first stage$project
, mongodb is doing a collection scan, outputtingALL documents
in that collection with that field. Now, It’s afull collection scan
andslow
$match
$project
$filter
$unwind
: with first stage$match
, mongodb is doing an index scan instead of a collection if It hit an index. After that, stage$project
will do on alimit document
that matched. Now, it aindex scan
andfaster
because mongodb only do on some document.Final, with your 2 option, the second is better