skip to Main Content

I’m starting to learn Aggregate in MongoDB. I have a simple doc as below, which has 2 fields, name and examScores. examScores is an array containing multiple documents:

[
  { 
    _id: ObjectId("633199db009be219a43ae426"),
    name: 'Max',
    examScores: [ 
      { difficulty: 4, score: 57.9 },
      { difficulty: 6, score: 62.1 },
      { difficulty: 3, score: 88.5 } 
    ] 
  },
  { 
    _id: ObjectId("633199db009be219a43ae427"),
    name: 'Manu',
    examScores: [ 
      { difficulty: 7, score: 52.1 },
      { difficulty: 2, score: 74.3 },
      { difficulty: 5, score: 53.1 } ] }
    ]
  }
]

Now I query the maximum score of each person using $unwind and $group/$max as below:

db.test.aggregate([
    {$unwind: "$examScores"},
    {$group: {_id: {name: "$name"}, maxScore: {$max: "$examScores.score"}}}
])
{ _id: { name: 'Max' }, maxScore: 88.5 }
{ _id: { name: 'Manu' }, maxScore: 74.3 }

But I want the result also contains the examScores.difficulty field corresponding to name and examScores.score, like below:

[
  { _id: { name: 'Max' }, difficulty: 3, maxScore: 88.5 }
  { _id: { name: 'Manu' }, difficulty: 2, maxScore: 74.3 }
]

I know that I can use $sort + $group and $first to achieve this goal. But I want to use $getField or any other methods to get data from the root doc.

My idea is to use $project and $getField to get the difficulty field from the root doc (or $unwind version of the root doc) with the condition like ROOT.name = Aggregate.name and Root.examScores.score = Aggregate.maxScore.

It will look something like this:

{$project: 
          {name: 1, 
          maxScore: 1, 
          difficulty: 
                        {$getField: {
                         field: "$examScores.difficulty"
                         input: "$$ROOT.$unwind() with condition/filter"}
                        }
           }
}

I wonder if this is possible in MongoDB.

2

Answers


  1. Solution 1

    1. $unwind

    2. $group – Group by name. You need $push to add the $$ROOT document into data array.

    3. $project – Set the difficulty field by getting the value of examScores.difficulty from the first item of the filtered data array by matching the examScores.score with maxScore.

    db.collection.aggregate([
      {
        $unwind: "$examScores"
      },
      {
        $group: {
          _id: {
            name: "$name"
          },
          maxScore: {
            $max: "$examScores.score"
          },
          data: {
            $push: "$$ROOT"
          }
        }
      },
      {
        $project: {
          _id: 0,
          name: "$_id.name",
          maxScore: 1,
          difficulty: {
            $getField: {
              field: "difficulty",
              input: {
                $getField: {
                  field: "examScores",
                  input: {
                    $first: {
                      $filter: {
                        input: "$data",
                        cond: {
                          $eq: [
                            "$$this.examScores.score",
                            "$maxScore"
                          ]
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    ])
    

    Demo Solution 1 @ Mongo Playground


    Solution 2: $rank

    1. $unwind

    2. $rank – Ranking by partition name and sort examScores.score descending.

    3. $match – Filter the document with { rank: 1 }.

    4. $unset – Remove rank field.

    db.collection.aggregate([
      {
        $unwind: "$examScores"
      },
      {
        $setWindowFields: {
          partitionBy: "$name",
          sortBy: {
            "examScores.score": -1
          },
          output: {
            rank: {
              $rank: {}
            }
          }
        }
      },
      {
        $match: {
          rank: 1
        }
      },
      {
        $unset: "rank"
      }
    ])
    

    Demo Solution 2 @ Mongo Playground


    Opinion: I would say this approach:

    1. $sort by examScores.score descending
    2. $group by name, take the first document

    would be much easier.

    Login or Signup to reply.
  2. There’s no need to $unwind and then rebuild the documents again via $group to achieve your desired results. I’d recommend avoiding that altogether.

    Instead, consider processing the arrays inline using array expression operators. Depending on the version and exact results you are looking for, here are two starting points that may be worth considering. In particular the $maxN operator and the $sortArray operator may be of interest for this particular question.

    You can get a sense for what these two operators do by running an $addFields aggregation to see their output, playground here.

    With those as a starting point, it’s really up to you to make the pipeline output the desired result. Here is one such example that matches the output you described in the question pretty well (playground):

    db.collection.aggregate([
      {
        "$addFields": {
          "relevantEntry": {
            $first: {
              $sortArray: {
                input: "$examScores",
                sortBy: {
                  "score": -1
                }
              }
            }
          }
        },
        
      },
      {
        "$project": {
          _id: 0,
          name: 1,
          difficulty: "$relevantEntry.difficulty",
          maxScore: "$relevantEntry.score"
        }
      }
    ])
    

    Which yields:

    [
      {
        "difficulty": 3,
        "maxScore": 88.5,
        "name": "Max"
      },
      {
        "difficulty": 2,
        "maxScore": 74.3,
        "name": "Manu"
      }
    ]
    

    Also worth noting that this particular approach doesn’t do anything special if there are duplicates. You could look into using $filter if something more was needed in that regard.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search