skip to Main Content

in MongoDB I want to group an array of documents that is nested in another document without it affecting the parent document.

Database:

db={
      "users": [
        {
          "firstName": "David",
          "lastName": "Mueller",
          "messages": [
            {
              "text": "hello",
              "type": "PERSONAL"
            },
            {
              "text": "test",
              "type": "DIRECT"
            }
          ]
        },
        {
          "firstName": "Mia",
          "lastName": "Davidson",
          "messages": [
            {
              "text": "hello world",
              "type": "DIRECT"
            },
            {
              "text": ":-)",
              "type": "PERSONAL"
            },
            {
              "text": "hi there",
              "type": "DIRECT"
            }
          ]
        }
      ]
    }

Desired result:

[
      {
        "firstName": "David",
        "lastName": "Mueller",
        "messages": [
          {
            "_id": "PERSONAL",
            "count": 1
          },
          {
            "_id": "DIRECT",
            "count": 1
          }
        ]
      },
      {
        "firstName": "Mia",
        "lastName": "Davidson",
        "messages": [
          {
            "_id": "PERSONAL",
            "count": 1
          },
          {
            "_id": "DIRECT",
            "count": 2
          }
        ]
      }
]

If I have an array of ids I already know how to do it using the internal pipeline of $lookup, but my question is how can I do that with an array of embedded documents.


This is an example of a working grouping on an array with ids using lookup. This is not the solution because the question is about an embedded document array and not an array of ids. This example is only provided to show that I can archive the desired result when ids instead of embedded documents are stored in an array.

Database for grouping with lookup:

db={
  "users": [
    {
      "firstName": "David",
      "lastName": "Mueller",
      "messages": [
        1,
        2
      ]
    },
    {
      "firstName": "Mia",
      "lastName": "Davidson",
      "messages": [
        3,
        4,
        5
      ]
    }
  ],
  "messages": [
    {
      "_id": 1,
      "text": "hello",
      "type": "PERSONAL"
    },
    {
      "_id": 2,
      "text": "test",
      "type": "DIRECT"
    },
    {
      "_id": 3,
      "text": "hello world",
      "type": "DIRECT"
    },
    {
      "_id": 4,
      "text": ":-)",
      "type": "PERSONAL"
    },
    {
      "_id": 5,
      "text": "hi there",
      "type": "DIRECT"
    }
  ]
}

Aggregation of grouping with lookup:

db.users.aggregate([
  {
    "$lookup": {
      "from": "messages",
      "localField": "messages",
      "foreignField": "_id",
      "as": "messages",
      "pipeline": [
        {
          "$group": {
            "_id": "$type",
            "count": {
              "$sum": 1
            }
          }
        }
      ]
    }
  }
])

Result of grouping with lookup (which is the desired result):

[
  {
    "_id": ObjectId("5a934e000102030405000005"),
    "firstName": "David",
    "lastName": "Mueller",
    "messages": [
      {
        "_id": "PERSONAL",
        "count": 1
      },
      {
        "_id": "DIRECT",
        "count": 1
      }
    ]
  },
  {
    "_id": ObjectId("5a934e000102030405000006"),
    "firstName": "Mia",
    "lastName": "Davidson",
    "messages": [
      {
        "_id": "PERSONAL",
        "count": 1
      },
      {
        "_id": "DIRECT",
        "count": 2
      }
    ]
  }
]

This example in the MongoDB playground


Now back to the issue: I want to archive the same result but with an embedded document array as provided at the top.

I cannot find out how to do this (I tried AI, lot’s of google searches and other forums without success, you are my last resource before giving up), I know I can filter an embedded array using $addField and $fitler but not how I can group just the embedded array.

Please note that this is just a simple example my real data structure looks different and might also use other grouping functions like min, sum etc. but I just wanted to know a general way of archieving the same thing as when I use the lookup.

I appreciate any help with this and thank you 🙂

2

Answers


    1. Unwind messages
    2. Group by the _id (presumably userID) and message type; and set countType with $count.
    3. Then re-group by _id only and use the first doc (since it’s the same for non-message fields)
      • Push each {type: ..., count: countType} into a messages array.
      • Note that the documents are back to being one per user/_id
    4. Set the doc.messages to the array messages which was pushed in the previous step.
    5. Replace the root with the new doc which has all the correct info
    6. (Optionally, sort on _id if you need it.)
    db.users.aggregate([
      { $unwind: "$messages" },
      {
        $group: {
          _id: {
            _id: "$_id",
            type: "$messages.type"
          },
          countType: { $count: {} },
          doc: { $first: "$$ROOT" }
        }
      },
      {
        $group: {
          _id: "$_id._id",
          doc: { $first: "$doc" },
          messages: {
            $push: {
              // since you want message type as `_id`
              _id: "$_id.type",
              count: "$countType"
            }
          }
        }
      },
      { $set: { "doc.messages": "$messages" } },
      { $replaceWith: "$doc" }
    ])
    

    Mongo Playground

    Login or Signup to reply.
  1. here is an alternative way of doing without group and unwind.

    1. $setUnion: "$messages.type" will give a list of the unique $messages.type so now you are essentially looping it using $map
    2. In each iteration of the loop I return an object in the form of {_id, count}. To calculate the count I’m filtering the initial messages array based on the type and getting the size of the filtered array
    db.users.aggregate([
      {
        $addFields: {
          messages: {
            $map: {
              input: { $setUnion: "$messages.type" },
              as: "type",
              in: {
                _id: "$$type",
                count: {
                  $size: {
                    $filter: {
                      input: "$messages",
                      as: "msg",
                      cond: { $eq: [ "$$msg.type", "$$type" ] }
                    }
                  }
                }
              }
            }
          }
        }
      }
    ])
    

    playground

    The above query will make sense if you could understand this. This is how i came up with the logic actually 🙂

    const db={ "users": [ { "firstName": "David", "lastName": "Mueller", "messages": [ { "text": "hello", "type": "PERSONAL" }, { "text": "test", "type": "DIRECT" } ] }, { "firstName": "Mia", "lastName": "Davidson", "messages": [ { "text": "hello world", "type": "DIRECT" }, { "text": ":-)", "type": "PERSONAL" }, { "text": "hi there", "type": "DIRECT" } ] } ] }
        
    const res = db.users.map(user => {
      const uniqTypes = [...new Set(user.messages.map(msg => msg.type))];
      const messages = uniqTypes.map(type => {
        const count = user.messages.filter(msg => msg.type === type).length;
        return {_id: type, count};
      })
      return {...user, messages};
    })
        
    console.log(res)
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search