skip to Main Content

I have a collection made like this

[
  {timestamp: xxx, type:'start'},
  {timestamp: xxx, type:'log'},
  {timestamp: xxx, type:'log'},
  {timestamp: xxx, type:'log'},
  {timestamp: xxx, type:'start'},
  {timestamp: xxx, type:'log'},
  {timestamp: xxx, type:'log'}
]

how do I aggregate the start event and the log event until but not including the next start?

the result should be something like this
something like this

{
  events:[
   {timestamp: xxx, type:'start'},
   {timestamp: xxx, type:'log'},
   {timestamp: xxx, type:'log'},
   {timestamp: xxx, type:'log'}
  ]
},
{
  events:[
   {timestamp: xxx, type:'start'},
   {timestamp: xxx, type:'log'},
   {timestamp: xxx, type:'log'}
  ]
}

2

Answers


  1. It is a bit difficult, because the way you retracted your data makes your question pointless.

    I guess, timestamp is a sequence of events. In this case a solution could be this one:

    db.collection.aggregate([
       { $group: { _id: null, events: { $push: "$$ROOT" } } },
       {
          $project: {
             events: {
                $reduce: {
                   input: { $sortArray: { input: "$events", sortBy: { timestamp: 1 } } },
                   initialValue: [],
                   in: {
                      $concatArrays: ["$$value",
                         [{
                            $mergeObjects: ["$$this", {
                               branch: {
                                  $cond: {
                                     if: { $eq: ["$$this.type", 'start'] },
                                     then: { $ifNull: [{ $add: [{ $last: "$$value.branch" }, 1] }, 0] },
                                     else: { $last: "$$value.branch" }
                                  }
                               }
                            }]
                         }]
                      ]
                   }
                }
             }
          }
       },
       { $unwind: "$events" },
       { $group: { _id: "$events.branch", events: { $push: { timestamp: "$events.timestamp", type: "$events.type" } } } },
    ])
    

    Mongo Playground

    Another solution is this one:

    db.collection.aggregate([
      {
        $setWindowFields: {
          sortBy: { timestamp: 1 },
          output: {
            index: { $documentNumber: {} }
          }
        }
      },
      {
        $group: {
          _id: null,
          events: { $push: "$$ROOT" }
        }
      },
      {
        $set: {
          starts: {
            $filter: {
              input: "$events",
              cond: { $eq: [ "$$this.type", "start" ] }
            }
          }
        }
      },
      {
        $project: {
          events: {
            $map: {
              input: { $range: [ 0, { $size: "$starts" } ] },
              as: "i",
              in: {
                $slice: [
                  "$events", {
                  $subtract: [
                      { $arrayElemAt: [ "$starts.index", "$$i" ] }, 1 ] },
                      { $subtract: [
                      { $arrayElemAt: [ { $concatArrays: [ 
                          "$starts.index",
                          [ { $add: [ { $size: "$events" }, 1 ] } ] 
                          ] },
                          { $add: [ "$$i", 1 ] }
                        ]
                      },
                      { $arrayElemAt: [ "$starts.index", "$$i" ] }
                    ]
                  }
                ]
              }
            }
          }
        }
      }
    ])
    

    Mongo Playground

    Login or Signup to reply.
  2. Assumption: the timestamp field is "sortable" type

    You can use $setWindowFields to compute "grouping" for the events. The idea is to find the max timestamp with start in the window range of [unbounded, current] (i.e. find in the documents before and in current document with a $max). After computing the "grouping", just do a simple $group to put the events together in an array.

    db.collection.aggregate([
      {
        "$setWindowFields": {
          "sortBy": {
            "timestamp": 1
          },
          "output": {
            "grouping": {
              "$max": {
                "$cond": {
                  "if": {
                    "$eq": [
                      "$type",
                      "start"
                    ]
                  },
                  "then": "$timestamp",
                  "else": -1
                }
              },
              "window": {
                "documents": [
                  "unbounded",
                  "current"
                ]
              }
            }
          }
        }
      },
      {
        "$group": {
          _id: "$grouping",
          events: {
            $push: {
              timestamp: "$timestamp",
              type: "$type"
            }
          }
        }
      }
    ])
    

    Mongo Playground

    P.S. Your expected result form may suffer from MongoDB 16MB document size if a single grouping contains too many records.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search