skip to Main Content

I was using $dateTrunc on a timeseries collection to group the timeseries data from 2024-12-01 to 2024-12-07 into a single bin.

Here is the aggregate pipeline:

[
  {
    $match: {
      timestamp: {
        $gte: ISODate('2024-12-01'),
        $lte: ISODate('2024-12-07'),
      },
    }
  },
  {
    $group: {
      _id: {
        timestamp: {
          $dateTrunc: {
            date: "$timestamp",
            unit: 'day',
            binSize: 7,
          },
        }
      },
      count: {
        $sum: 1
      }
    }
  },
  {
    $set: {
      timestamp: "$_id.timestamp"
    }
  },
  {
    $sort: {
      timestamp: 1
    }
  }
]

Here is the result:

[
  {
    "_id": {
      "timestamp": {
        "$date": "2024-11-30T00:00:00.000Z"
      }
    },
    "count": 719,
    "timestamp": {
      "$date": "2024-11-30T00:00:00.000Z"
    }
  },
  {
    "_id": {
      "timestamp": {
        "$date": "2024-12-07T00:00:00.000Z"
      }
    },
    "count": 116,
    "timestamp": {
      "$date": "2024-12-07T00:00:00.000Z"
    }
  }
]

I expect the aggregation pipeline to return a single data point that sum up 7 days value (with a timestamp starting from 2024-12-01). However, the result is not what I expected.

It would be great if anyone could provide guidance on this issue. Thank you in advance!

enter image description here
enter image description here
enter image description here

UPDATE
Some information about timeseries collection:

  • timeField -> timestamp
  • metaField -> tags
  • expireAfterSeconds -> undefined
  • granularity -> Hours
  • bucketMaxSpanSeconds -> undefined

2

Answers


  1. Obviously 1970-01-01 is not the starting point. When you use unit: 'day' with binsize, the starting point seems to be quite random and not deterministic:

    for (let i = 1; i <= 20; i++) {
       const ret = db.aggregate([
          { $documents: [ {timestamp: ISODate("2024-12-01")} ] },
          { $project: {
             _id:0, 
             binSize: `${i}`, 
             day: { 
               $dateToString: { 
                 date: { 
                   $dateTrunc: { 
                     date: "$timestamp", 
                     unit: "day", 
                     binSize: i } 
                   }, 
                 format: "%u %j" } 
               } 
             }
          }
       ]).toArray().shift();print(ret);
    }
    
    { binsize: '1', day: '7 336' }
    { binsize: '2', day: '6 335' }
    { binsize: '3', day: '5 334' }
    { binsize: '4', day: '6 335' }
    { binsize: '5', day: '6 335' }
    { binsize: '6', day: '2 331' }
    { binsize: '7', day: '6 335' }
    { binsize: '8', day: '2 331' }
    { binsize: '9', day: '5 334' }
    { binsize: '10', day: '6 335' }
    { binsize: '11', day: '3 332' }
    { binsize: '12', day: '2 331' }
    { binsize: '13', day: '6 335' }
    { binsize: '14', day: '6 335' }
    { binsize: '15', day: '3 325' }
    { binsize: '16', day: '1 323' }
    { binsize: '17', day: '1 330' }
    { binsize: '18', day: '3 325' }
    { binsize: '19', day: '7 336' }
    { binsize: '20', day: '6 335' }
    

    I don’t see any rule!

    Depending on your requirements, you need either unit: 'week', startOfWeek: 'monday' or you may have a look at $setWindowFields if you are looking for a sliding window.

    Login or Signup to reply.
  2. The $dateTrunc operator calculates the bin for the specified date according to its offset from a base date. The base date will be 2000-01-01 for any unit except "week".

    The basic process would be:

    • subtract the base date from the date in question – will give the number of milliseconds
    • divide the number of milliseconds by the bin size in milliseconds, rounding down
    • add the result to the base date

    Using the test sequence lifted from another answer, this method gives the same result:

    for (let i = 1; i <= 20; i++) {
       const ret = db.aggregate([
          { $documents: [ {timestamp: ISODate("2024-12-01"),basedate: ISODate("2000-01-01")} ] },
          { $addFields: {
             ms: {$dateDiff:{startDate:"$basedate",endDate:"$timestamp",unit:"millisecond"}},
             binms: i * 86400000
          }},
          { $addFields: {
            newdate: {$dateAdd:{
                startDate: "$basedate",
                unit: "millisecond",
                amount: {$multiply: [ "$binms", {$trunc: {$divide: ["$ms", "$binms"] }} ] }
            }}
          }},
          { $project: {
             _id:0, 
             binSize: `${i}`, 
             day: { 
               $dateToString: { 
                 date: "$newdate", 
                 format: "%u %j" 
               } 
             }, 
             binDate: "$newdate"
          }}
       ]).toArray().shift();
       print(ret);
    }
    
    { "binSize":"1", "day":"7 336", "binDate":"2024-12-01T00:00:00.000Z"}
    { "binSize":"2", "day":"6 335", "binDate":"2024-11-30T00:00:00.000Z"}
    { "binSize":"3", "day":"5 334", "binDate":"2024-11-29T00:00:00.000Z"}
    { "binSize":"4", "day":"6 335", "binDate":"2024-11-30T00:00:00.000Z"}
    { "binSize":"5", "day":"6 335", "binDate":"2024-11-30T00:00:00.000Z"}
    { "binSize":"6", "day":"2 331", "binDate":"2024-11-26T00:00:00.000Z"}
    { "binSize":"7", "day":"6 335", "binDate":"2024-11-30T00:00:00.000Z"}
    { "binSize":"8", "day":"2 331", "binDate":"2024-11-26T00:00:00.000Z"}
    { "binSize":"9", "day":"5 334", "binDate":"2024-11-29T00:00:00.000Z"}
    { "binSize":"10", "day":"6 335", "binDate":"2024-11-30T00:00:00.000Z"}
    { "binSize":"11", "day":"3 332", "binDate":"2024-11-27T00:00:00.000Z"}
    { "binSize":"12", "day":"2 331", "binDate":"2024-11-26T00:00:00.000Z"}
    { "binSize":"13", "day":"6 335", "binDate":"2024-11-30T00:00:00.000Z"}
    { "binSize":"14", "day":"6 335", "binDate":"2024-11-30T00:00:00.000Z"}
    { "binSize":"15", "day":"3 325", "binDate":"2024-11-20T00:00:00.000Z"}
    { "binSize":"16", "day":"1 323", "binDate":"2024-11-18T00:00:00.000Z"}
    { "binSize":"17", "day":"1 330", "binDate":"2024-11-25T00:00:00.000Z"}
    { "binSize":"18", "day":"3 325", "binDate":"2024-11-20T00:00:00.000Z"}
    { "binSize":"19", "day":"7 336", "binDate":"2024-12-01T00:00:00.000Z"}
    { "binSize":"20", "day":"6 335", "binDate":"2024-11-30T00:00:00.000Z"}
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search