
I have millions of documents in a MongoDB collection.

I want to group the documents by minute and by hour intervals.
I created an index on timestamp.
I tried the query below, but it takes too long or times out.

[ {
  $group: {
    _id: {
      interval: {
        $minute: {
          // timestamp is stored in seconds; $toDate expects milliseconds
          $toDate: {
            $multiply: ["$timestamp", 1000],
          },
        },
      },
    },
    timestamp: {
      $first: "$timestamp",
    },
    count: {
      $sum: 1,
    },
  },
} ]

How can I resolve this issue?

I also tried **skip** and **limit**, but I still face the same issue.

2 Answers


  1. Please check this; it may be helpful.

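    {
      $group: {
        _id: {
          interval: {
            $dateToString: {
              format: "%Y-%m-%d %H:%M",
              // timestamp is in seconds (per the question), so convert to milliseconds
              date: { $toDate: { $multiply: ["$timestamp", 1000] } }
            }
          }
        },
        timestamp: { $first: "$timestamp" },
        count: { $sum: 1 }
      }
    }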
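
One likely reason for the timeout is that a `$group` on a computed key has to read every document, so the index on `timestamp` only helps if a `$match` on that field comes first. Note also that `$minute` returns only the minute of the hour (0 to 59), so documents from different hours land in the same group. Below is a minimal sketch, assuming `timestamp` holds Unix seconds (as the `$multiply` by 1000 in the question suggests). The helper name `buildIntervalPipeline`, the collection name `events`, and the time range are hypothetical.

```javascript
// Hypothetical helper: builds a pipeline that counts documents per fixed interval.
// Assumes `timestamp` is a number of seconds since the Unix epoch.
function buildIntervalPipeline(startSec, endSec, intervalSec) {
  return [
    // Restrict the time range first so the index on `timestamp` can be used.
    { $match: { timestamp: { $gte: startSec, $lt: endSec } } },
    {
      $group: {
        // Floor each timestamp to the start of its interval:
        // timestamp - (timestamp mod intervalSec)
        _id: {
          $subtract: ["$timestamp", { $mod: ["$timestamp", intervalSec] }],
        },
        count: { $sum: 1 },
      },
    },
    // Return the intervals in chronological order.
    { $sort: { _id: 1 } },
  ];
}

// One day of data, grouped per minute (60 s) and per hour (3600 s).
const perMinute = buildIntervalPipeline(1700000000, 1700086400, 60);
const perHour = buildIntervalPipeline(1700000000, 1700086400, 3600);

// In mongosh this could be run as, for example:
// db.events.aggregate(perMinute, { allowDiskUse: true })
```

Each resulting `_id` is the start of an interval in seconds, so it can be converted to a date on the client if needed. Processing a large collection one time window at a time in this way is usually cheaper than paging with skip and limit, since skipped documents still have to be scanned.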
    
  2. To group millions of documents based on intervals using MongoDB, you can utilize the aggregation framework and the $bucket operator. The $bucket operator allows you to group documents into buckets based on specified boundaries.

    Here’s an example of how you can use the $bucket operator to group documents based on intervals:

    db.collection.aggregate([
      {
        $bucket: {
          groupBy: "$fieldToGroupBy",
          boundaries: [interval1, interval2, interval3, ...],
          default: "Other",
          output: {
            "count": { $sum: 1 },
            "documents": { $push: "$$ROOT" }
          }
        }
      }
    ])
    

    In the above example:

    • collection is the name of your MongoDB collection.
    • fieldToGroupBy is the field in your documents that you want to group by.
    • interval1, interval2, interval3, etc. are the boundaries for your intervals. You can specify any number of boundaries.
    • "Other" is the label for the documents that do not fall into any of the specified intervals.
    • "count" is the field that will hold the count of documents in each interval.
    • "documents" is an array that will hold the actual documents in each interval.

    Make sure to replace collection, fieldToGroupBy, and the interval boundaries with your actual values.
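
For time intervals, the boundaries do not have to be typed by hand. The sketch below generates evenly spaced boundaries and builds a `$bucket` stage from them. It assumes `timestamp` holds Unix seconds, as in the question. The helper name `makeBoundaries` and the example range are hypothetical. The `documents` output is left out here, because pushing `$$ROOT` for millions of documents can make a single bucket exceed MongoDB's 16 MB document size limit.

```javascript
// Hypothetical helper: evenly spaced $bucket boundaries from startSec to endSec.
// Assumes startSec < endSec and stepSec > 0. The last bucket may be shorter
// than stepSec if the range is not an exact multiple of the step.
function makeBoundaries(startSec, endSec, stepSec) {
  const boundaries = [];
  for (let t = startSec; t < endSec; t += stepSec) {
    boundaries.push(t);
  }
  boundaries.push(endSec); // upper bound of the last bucket (exclusive)
  return boundaries;
}

// One hour split into 10-minute buckets: 7 boundaries, 6 buckets.
const boundaries = makeBoundaries(1700000000, 1700003600, 600);

const bucketStage = {
  $bucket: {
    groupBy: "$timestamp",
    boundaries: boundaries, // each bucket covers [lower, upper)
    default: "Other",       // documents outside all boundaries
    output: { count: { $sum: 1 } },
  },
};

// In mongosh: db.collection.aggregate([bucketStage])
```

Keep in mind that per-minute buckets over a long time range produce a very large boundaries array, so this approach fits best when the range is limited.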

    This query will return the grouped documents along with the count of documents in each interval. You can modify the output fields as per your requirements.

    The $bucket operator is available starting from MongoDB version 3.4.
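
On MongoDB 5.0 and later, the `$dateTrunc` operator is another option that avoids listing boundaries at all. This is a different technique from `$bucket`, shown only as a sketch. It assumes `timestamp` holds Unix seconds and reuses the seconds-to-date conversion from the question. The helper name `buildDateTruncPipeline` is hypothetical.

```javascript
// Hypothetical helper; $dateTrunc requires MongoDB 5.0 or later.
// `unit` is a $dateTrunc unit such as "minute" or "hour".
function buildDateTruncPipeline(unit) {
  return [
    {
      $group: {
        _id: {
          $dateTrunc: {
            // Convert seconds to milliseconds, then to a date (as in the question).
            date: { $toDate: { $multiply: ["$timestamp", 1000] } },
            unit: unit,
          },
        },
        count: { $sum: 1 },
      },
    },
    // Return the intervals in chronological order.
    { $sort: { _id: 1 } },
  ];
}

const perHourTrunc = buildDateTruncPipeline("hour");

// In mongosh: db.collection.aggregate(perHourTrunc)
```

Unlike `$minute`, this keeps the full date in the group key, so the same minute in different hours is not merged. On a large collection it would still be worth adding a `$match` on `timestamp` before the `$group` so that the index can narrow the scan.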
