I have an aggregation pipeline in which I’m trying to find the unique values within an array property and also the number of instances of each value within each document’s array.
I arrive at a stage where the documents look like this:
{
_id: ObjectId("5d8cac657d2d1e0145268fb4"),
values: [
"5d8a2c3d4768d9660d3ba383",
"5d8a1d4f4768d951cb6c8989"
]
},
{
_id: ObjectId("5d8caf4039321e7b14061a46"),
values: [
"5d8a2c3d4768d9660d3ba383",
"5d8c835a39321e7b15792353"
]
}
and from here I’d like to end up with a result like the following:
{
values: {
"5d8a2c3d4768d9660d3ba383": 2,
"5d8a1d4f4768d951cb6c8989": 1,
"5d8c835a39321e7b15792353" : 1
}
}
or
{
values: [
{ "id": "5d8a2c3d4768d9660d3ba383", "count" : 2 },
{ "id": "5d8a1d4f4768d951cb6c8989", "count" : 1 },
{ "id": "5d8c835a39321e7b15792353", "count" : 1 }
]
}
I’m not too concerned with the format of the result, as long as I have the unique values and their counts I can work with it — whatever the most performant approach is is fine. I can get the unique values by using a $group stage with $push and $reduce but I’m not able to also get the counts that way — could someone point me in the right direction?
2
Answers
If you want the first option you can try this query:
$unwind
the arrayvalues
to be able to get individualy each element.$group
to get the count for each element._id
asnull
to get all values into one single array. Every element here will be an object withk
andv
attributes. That will be used in the next stage.$project
to not output the_id
value (wich innull
) and to use $arrayToObject. Usingk
andv
in the previous stage it created an array with objects valid to be converted into the desired object.Example here
And for the second option you can try this query:
$$ROOT
to get the entire object. As here is not needed anything else you can set the object as it is.$project
only to not output the_id
value.Example here
Edit: I didn’t notice te second example has
id
and not_id
. If it is relevant you can use this query instead which is the same but instead of using$$ROOT
to set the entire object at it is, it create theid/count
object.You could use
$facet
as it allows you to:While not an overly complex query it means you can
$unwind
and$group
in the same$facet
with the output of the sub-pipelines added to avalues
array like so:See HERE for a working example.