I have data like this:

[
  { "grp": "A", "seq": 1, "score": 1, x: 0 },
  { "grp": "A", "seq": 1, "score": 2, x: 0 },
  { "grp": "A", "seq": 1, "score": 3, x: 0 },
  { "grp": "A", "seq": 1, "score": 4, x: 0 }
]

Using $setWindowFields,

{
  partitionBy: "$grp",
  sortBy: { seq: 1 },
  output: {
    x: {
      $sum: "$score",
      window: {
        documents: ["unbounded", "current"]
      }
    },
  }
}

I get:

[
  { "grp": "A", "seq": 1, "score": 1, x: 1 },
  { "grp": "A", "seq": 1, "score": 2, x: 3 },
  { "grp": "A", "seq": 1, "score": 3, x: 6 },
  { "grp": "A", "seq": 1, "score": 4, x: 10 }
]

I only need to retain the running sum (x) for the last item of the partition.

[
  { "grp": "A", "seq": 1, "score": 1, x: 0 },
  { "grp": "A", "seq": 1, "score": 2, x: 0 },
  { "grp": "A", "seq": 1, "score": 3, x: 0 },
  { "grp": "A", "seq": 1, "score": 4, x: 10 }
]

I’ve been experimenting with $project and $last, but I couldn’t make it work.

What expression or additional stage should I use?

Thank you!

2 Answers


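  1. I’m not sure your dataset has a deterministic sort order (every sample document has seq: 1, so the order among ties is not guaranteed), but keeping the same sort you already use, you can number the documents with $documentNumber in your $setWindowFields stage. Then, in a second $setWindowFields stage, sort by that ordering field in descending order and compute $rank. The last document of each partition gets rank: 1, which you can test with $cond to keep x there and set it to 0 everywhere else.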

    db.collection.aggregate([
      {
        "$setWindowFields": {
          partitionBy: "$grp",
          sortBy: {
            seq: 1
          },
          output: {
            x: {
              $sum: "$score",
              window: {
                documents: [
                  "unbounded",
                  "current"
                ]
              }
            },
            ordering: {
              $documentNumber: {}
            }
          }
        }
      },
      {
        "$setWindowFields": {
          "partitionBy": "$grp",
          "sortBy": {
            "ordering": -1
          },
          "output": {
            "rank": {
              "$rank": {}
            }
          }
        }
      },
      {
        "$set": {
          "ordering": "$$REMOVE",
          "rank": "$$REMOVE",
          "x": {
            "$cond": {
              "if": {
                $eq: [
                  1,
                  "$rank"
                ]
              },
              "then": "$x",
              "else": 0
            }
          }
        }
      }
    ])
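
    To sanity-check the logic outside the database, the three stages above can be mimicked in plain JavaScript. This is only a rough sketch: keepLastRunningSum is a made-up helper name, not a MongoDB API, and the input array order is assumed to stand in for collection order.

    ```javascript
    // Plain JavaScript model of the pipeline above (no MongoDB driver involved).
    // keepLastRunningSum is a hypothetical helper used only for illustration.
    function keepLastRunningSum(docs) {
      // partitionBy: "$grp"
      const partitions = new Map();
      for (const doc of docs) {
        if (!partitions.has(doc.grp)) partitions.set(doc.grp, []);
        partitions.get(doc.grp).push(doc);
      }

      const result = [];
      for (const group of partitions.values()) {
        // sortBy: { seq: 1 }. Array.prototype.sort is stable, so ties keep
        // their input order here; MongoDB gives no such guarantee for ties.
        const sorted = [...group].sort((a, b) => a.seq - b.seq);

        let running = 0;
        sorted.forEach((doc, i) => {
          running += doc.score;      // $sum over ["unbounded", "current"]
          const ordering = i + 1;    // $documentNumber
          // $rank on "ordering" sorted descending: highest ordering is rank 1.
          const rank = sorted.length - ordering + 1;
          // $cond: keep the running sum only where rank is 1.
          result.push({ ...doc, x: rank === 1 ? running : 0 });
        });
      }
      return result;
    }

    const sampleDocs = [
      { grp: "A", seq: 1, score: 1, x: 0 },
      { grp: "A", seq: 1, score: 2, x: 0 },
      { grp: "A", seq: 1, score: 3, x: 0 },
      { grp: "A", seq: 1, score: 4, x: 0 },
    ];
    console.log(keepLastRunningSum(sampleDocs).map((d) => d.x)); // [ 0, 0, 0, 10 ]
    ```

    The helper fields ordering and rank never reach the output here, which mirrors the "$$REMOVE" assignments in the final $set stage.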
    

  2. Ray’s answer and cmgchess’s example are both better approaches than the $group version shown here. It is still worth having, though, since you only need the sum on the last item of each group rather than a true running sum. $setWindowFields is the better fit for an actual running sum, as in the pipeline in your question.

    Here I sort and group, push all the documents into an array (so the 100 MB memory limit may become an issue for large groups), and set xs to the sum of score.

    To update only the last x with this value, I slice off everything but the last element of the docs array, set x = xs on that last element, concatenate the two parts, and then unwind.

    db.collection.aggregate([
      { $sort: { seq: 1 } },
      {
        $group: {
          _id: "$grp",
          docs: { $push: "$$ROOT" },
          xs: { $sum: "$score" }
        }
      },
      {
        $set: {
          docs: {
            $concatArrays: [
              {
                $slice: [
                  "$docs",
                  { $subtract: [{ $size: "$docs" }, 1] }
                ]
              },
              [
                {
                  $setField: {
                    field: "x",
                    input: { $last: "$docs" },
                    value: "$xs"
                  }
                }
              ]
            ]
          }
        }
      },
      { $unwind: "$docs" },
      { $replaceWith: "$docs" }
    ])
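
    For comparison, the same steps can be modelled in plain JavaScript. Again, this is just a sketch under assumptions: groupAndPatchLast is a made-up helper name, not part of MongoDB, and a Map stands in for $group (it keeps insertion order, which $group does not promise).

    ```javascript
    // Plain JavaScript model of the $group approach (no MongoDB driver involved).
    // groupAndPatchLast is a hypothetical helper used only for illustration.
    function groupAndPatchLast(docs) {
      // { $sort: { seq: 1 } }
      const sorted = [...docs].sort((a, b) => a.seq - b.seq);

      // $group by grp: push every document and accumulate the total score.
      const groups = new Map();
      for (const doc of sorted) {
        const entry = groups.get(doc.grp) ?? { members: [], xs: 0 };
        entry.members.push(doc);
        entry.xs += doc.score;
        groups.set(doc.grp, entry);
      }

      // $slice + $setField + $concatArrays, then $unwind and $replaceWith.
      const result = [];
      for (const { members, xs } of groups.values()) {
        const head = members.slice(0, members.length - 1); // all but the last
        const last = { ...members[members.length - 1], x: xs }; // patched last
        result.push(...head, last);
      }
      return result;
    }

    const inputDocs = [
      { grp: "A", seq: 1, score: 1, x: 0 },
      { grp: "A", seq: 1, score: 2, x: 0 },
      { grp: "A", seq: 1, score: 3, x: 0 },
      { grp: "A", seq: 1, score: 4, x: 0 },
    ];
    console.log(groupAndPatchLast(inputDocs).map((d) => d.x)); // [ 0, 0, 0, 10 ]
    ```

    One difference worth noting: this approach leaves x untouched on every document except the last, so the zeros in the output come from the input data rather than from the pipeline.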
    
