skip to Main Content

I have a collection of documents such as:

{
    _id: 5,
    responses: [
        {
            staff: false,
            timestamp: 1000
        },
        {
            staff: true,
            timestamp: 1500
        }
    ]
}

I have a function (using $function) used to apply some custom logic with the responses array:

const diffs=[];
let current;
for (let i = 0; i < responses.length; i++) {
  if (i+1>=responses.length) break;
  if (!current && !responses[i].staff) current = responses[i];
  if (!current) continue;
  const next = responses[i+1];
  if (!next.staff) continue;
  diffs.push(next.timestamp - current.timestamp);
  current = undefined;
}
return diffs;

which basically returns array of numbers like [500, 1000, 10] etc. It can also potentially return an empty array ([]).

I want to basically combine all arrays into one (say one document returns [5, 10] and the next one returns [1, 2], the result would be [5, 10, 1, 2] — order doesn’t matter) and then calculate the average using $avg.

I was reading MongoDB docs and found $concatArrays, so to my understanding the process should be:

  1. Calculate diffs for each document, which will end up with an array like [[1, 2], [3, 4], [5, 6, 7], [], ...]
  2. Use $concatArrays on the value from step 1
  3. Use $avg on the array from step 2

How should I go about making step 1? The only part I’m not sure about is how to hold a variable in the first grouping stage with the result returned from $function. I understand I need to do something like this:

aggregate([
    {$group: {diffs: {$function: {...}}}}
])

However, I get the error MongoServerError: unknown group operator '$function'.

2

Answers


  1. Chosen as BEST ANSWER

    I ended up getting it working the way I wanted. Sorry if there was some confusion over the data to anyone reading the original question -- the basic gist of the problem was, for each document the function calculates an array which are all numbers (like [1, 2, 3]), and I wanted to make one big array with all the numbers combined from all the documents, then calculate an average with it.

    I got it working by first calculating the array for each document (diffs), unwinding the array, using $push to push each element of the array into a new one, and then finally using $avg on the created array. Hopefully this helps someone with a similar problem down the line.

    db.tickets.aggregate([
        {
            $project: {
                responseTimes: {
                    $function: {
                        body: function(responses) {
                            const diffs=[];
                            let current;
                            for (let i = 0; i < responses.length; i++) {
                                if (i+1>=responses.length) break;
                                if (!current && !responses[i].staff) current = responses[i];
                                if (!current) continue;
                                const next = responses[i+1];
                                if (!next.staff) continue;
                                diffs.push(next.timestamp - current.timestamp);
                                current = undefined;
                            }
                            return diffs;
                        },
                        args: ["$responses"],
                        lang: "js"
                    }
                }
            }
        },
    
        {$unwind: "$responseTimes"},
    
        {
            $group: {
                _id: null,
    
                responseTimes: {
                    $push: "$responseTimes"
                }
            }
        },
    
        {
            $project: {
                avgResponseTime: {
                    $avg: "$responseTimes"
                }
            }
        }
    ])
    

  2. Documentation says:

    Executing JavaScript inside an aggregation expression may decrease performance. Only use the $function operator if the provided pipeline operators cannot fulfill your application’s needs.

    A starting point to achieve the same natively with an aggregation pipeline could be this one. I did not spend much time in testing, but this would be the direction.

    db.collection.aggregate([
      {
        $set: {
          responseTimes: {
            $reduce: {
              input: "$responses",
              initialValue: [],
              in: {
                $cond: {
                  if: { $not: "$$this.staff" },
                  then: { $concatArrays: [ "$$value", [ "$$this.timestamp" ] ] },
                  else: {
                    $cond: {
                      if: { $gt: [ { $size: "$$value" }, 0 ] },
                      then: {
                        $concatArrays: [
                          { $slice: [ "$$value", { $subtract: [ { $size: "$$value" }, 1 ]  } ] },
                          [ { $subtract: [ "$$this.timestamp", { $last: "$$value" } ] } ]
                        ]
                      },
                      else: null
                    }
                  }
                }
              }
            }
          }
        }
      },
      {
        $project: {
          avgResponseTime: { $avg: "$responseTimes" }
        }
      }
    ])
    

    https://mongoplayground.net/p/Ie6IL8atLX-

    A different approach is to use input: { $range: [ 0, {$subtract: [{ $size: "$responses" }, 1]} ] }, and then access the elements with { $arrayElemAt: [ "$responses", "$$this" ] }, maybe see this example which is a similar use-case.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search