
I have a collection of documents ordered by one field, and I want to group them by another field, but only adjacent documents should end up in the same group. For example:

[
  {order: 1, state: 'one'},
  {order: 2, state: 'one'},
  {order: 3, state: 'one'},
  {order: 4, state: 'two'},
  {order: 5, state: 'two'},
  {order: 6, state: 'one'},
  {order: 7, state: 'two'},
  {order: 8, state: 'three'},
  {order: 9, state: 'three'}
]

should result in:

[
  [
    {order: 1, state: 'one'},
    {order: 2, state: 'one'},
    {order: 3, state: 'one'}
  ],
  [
    {order: 4, state: 'two'},
    {order: 5, state: 'two'}
  ],
  [
    {order: 6, state: 'one'}
  ],
  [
    {order: 7, state: 'two'}
  ],
  [
    {order: 8, state: 'three'},
    {order: 9, state: 'three'}
  ]
]

Unfortunately, I’m quite new to MongoDB, so I have little idea where to start.
I suspect this can be achieved with some kind of aggregation.

I tried $group with _id: null, $setWindowFields, and bucketing, but without success.

2 Answers


  1. I believe you need another field to group on; it does not look possible without one. Alternatively, you can use a programmatic approach to create the desired output.

  2. You’ll need to $sort & $group all your documents and then use $reduce to create an array-of-arrays of objects.

    For each sub-object, check whether it has to be added to the previous sub-array (when state is the same) or as a new sub-array with one doc (when state is different).

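    Note that this will not work if you have too many documents: by default each aggregation stage is limited to 100MB of memory, and the single document produced by $group must also fit within the 16MB BSON document size limit. So I hope you have some $match stage to apply before the first sort.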

    db.collection.aggregate([
      { $sort: { order: 1 } },
      {
        $group: {
          _id: null,
          docs: { $push: "$$ROOT" }
        }
      },
      {
        $set: {
          docs: {
            $reduce: {
              input: "$docs",
              initialValue: [],
              in: {
                $cond: {
                  if: {
                    // last value `state` is different or initial value
                    $ne: [
                      {
                        $getField: {
                          field: "state",
                          input: { $last: { $last: "$$value" } }
                        }
                      },
                      "$$this.state"
                    ]
                  },
                  then: { $concatArrays: [ "$$value", [["$$this"]] ] },
                  else: {
                    $concatArrays: [
                      // take all sub-arrays except the last one
                      {
                        $slice: [
                          "$$value",
                          { $subtract: [{ $size: "$$value" }, 1] }
                        ]
                      },
                      // add this element to the last subarray
                      // and concat it to the previous
                      [ { $concatArrays: [ { $last: "$$value" }, ["$$this"] ] } ]
                    ]
                  }
                }
              }
            }
          }
        }
      },
      {
        // if you want all the groups in one result, remove this stage
        $unwind: "$docs"
      }
    ])
    

    Mongo Playground: add or remove elements in the sample collection to see the behaviour at each step of the grouping. Note that each group (array) has to be the value of a field in a document; a document cannot itself be a bare array.
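
    To make the $reduce step easier to follow, here is a minimal Python sketch of the same fold: start a new sub-list when the state changes, otherwise append to the last sub-list. The function name `group_adjacent` and its `key` parameter are made up for illustration; this only mirrors the logic and is not something MongoDB runs.

```python
from functools import reduce


def group_adjacent(docs, key="state"):
    """Fold already-sorted docs into runs of adjacent docs sharing the same `key` value."""

    def step(groups, doc):
        # First doc, or the key value differs from the last doc of the last group:
        # start a new sub-list (the `then` branch of the $cond).
        if not groups or groups[-1][-1][key] != doc[key]:
            return groups + [[doc]]
        # Same key value: keep all groups except the last one, then re-add the
        # last group with this doc appended (the `else` branch of the $cond).
        return groups[:-1] + [groups[-1] + [doc]]

    return reduce(step, docs, [])


# Sample data from the question, already sorted by `order`.
docs = [
    {"order": 1, "state": "one"},
    {"order": 2, "state": "one"},
    {"order": 3, "state": "one"},
    {"order": 4, "state": "two"},
    {"order": 5, "state": "two"},
    {"order": 6, "state": "one"},
    {"order": 7, "state": "two"},
    {"order": 8, "state": "three"},
    {"order": 9, "state": "three"},
]

groups = group_adjacent(docs)
print([[d["order"] for d in g] for g in groups])
# [[1, 2, 3], [4, 5], [6], [7], [8, 9]]
```

    Like the pipeline, this rebuilds the outer list on every step, so it is meant to show the logic rather than to be efficient.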

    Edit: If the generated document is too big for the aggregation $group stage, you can do it programmatically, which is trivial. For example, in Python:

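    from itertools import groupby

    # your input docs combined into one list/cursor/iterable, already sorted by `order`
    lst = [...]
    # groupby only merges consecutive items with the same key, which is the adjacency rule
    final = [list(g) for k, g in groupby(lst, key=lambda e: e['state'])]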
    
    # final is as per your expected output
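
    One caveat worth spelling out: groupby only merges consecutive items, so the docs have to be in `order` sequence before grouping. A small runnable sketch, assuming the docs are plain dicts and arrive unsorted (the helper name `adjacent_groups` and the sample data are hypothetical):

```python
from itertools import groupby
from operator import itemgetter


def adjacent_groups(docs, sort_key="order", group_key="state"):
    """Sort docs by `sort_key`, then split them into runs of equal `group_key`."""
    ordered = sorted(docs, key=itemgetter(sort_key))
    # groupby starts a new run every time the key value changes, so
    # non-adjacent docs with the same state land in separate runs.
    return [list(run) for _, run in groupby(ordered, key=itemgetter(group_key))]


# Deliberately out of order to show why the sort matters.
unsorted_docs = [
    {"order": 3, "state": "two"},
    {"order": 1, "state": "one"},
    {"order": 4, "state": "one"},
    {"order": 2, "state": "one"},
]

result = adjacent_groups(unsorted_docs)
print([[d["order"] for d in g] for g in result])
# [[1, 2], [3], [4]]
```

    If the docs come from a query that already sorts by `order`, the sorted() call is redundant and groupby can consume the iterable directly.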
    