
I have built an analytics system on MongoDB. One of its aggregation pipelines scans every document in the collection instead of using the index. Here is my query:

const collection = mongoose.connection.db.collection('analytics');

  const pipeline = [
    {
      $facet: {
        last1Min: [
          {
            $match: match,
          },

          {
            $group: {
              _id: '$visit_uid',
              url: {$first: '$url'},
              page_title: {$first: '$page_title'},
              flag: {$first: '$flag'},
              country_name: {$first: '$country_name'},
              created_at: {$first: '$created_at'},
              user: {$first: '$user'},
            },
          },
          {
            $project: {
              _id: 1,
              visit_uid: '$_id',
              url: 1,
              flag: 1,
              created_at: 1,
              user: 1,
              country_name: 1,
              page_title: 1,
            },
          },
          {
            $sort: {
              created_at: -1, // Sort by created_at field in descending order (most recent first)
            },
          },
        ],
        last5Mins: [
          {
            $match: last5MinsMatch,
          },

          {
            $group: {
              _id: '$visit_uid',
              url: {$first: '$url'},
              page_title: {$first: '$page_title'},
              id: {$first: '$id'},
              flag: {$first: '$flag'},
              country_name: {$first: '$country_name'},
              created_at: {$first: '$created_at'},
              user: {$first: '$user'},
            },
          },
          {
            $project: {
              _id: 0,
              visit_uid: '$_id',
              url: 1,
              id: 1,
              flag: 1,
              created_at: 1,
              user: 1,
              country_name: 1,
              page_title: 1,
            },
          },
          {
            $sort: {
              created_at: -1, // Sort by created_at field in descending order (most recent first)
            },
          },
        ],
      },
    },
  ];

  return collection.aggregate(pipeline).explain("executionStats")

The collection holds more than 2139951 documents and the query scans all of them. An index was created on school_id, and each school has only around 500-1000 records.
The match objects are:

{
    "match": {
        "school_id": 2460,
        "created_at": {
            "$gte": "2024-03-02T09:53:29.828Z"
        },
        "user": {
            "$exists": true
        }
    },
    "last5MinsMatch": {
        "school_id": 2460,
        "created_at": {
            "$gte": "2024-03-02T09:49:29.828Z",
            "$lt": "2024-03-02T09:53:29.829Z"
        },
        "user": {
            "$exists": true
        }
    }
}

Could anyone help?

Thank you.

2 Answers


  1. I found an answer that discusses which operations can use an index, and it does not include $facet. That would explain why a pipeline that starts with $facet scans the whole collection.

  2. Try putting this stage at the top of the pipeline, before $facet. It may improve performance:

    {$match: {$or: [match, last5MinsMatch]}}
    

    You may have to wrap it in $expr, depending on the conditions.
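
A minimal sketch of the suggestion in answer 2, together with a small helper for checking whether the plan still contains a collection scan. The helper names `withLeadingMatch` and `collectPlanStages` are made up for illustration and are not part of MongoDB or Mongoose. The filters use `Date` objects on the assumption that created_at is stored as a BSON date (the dump in the question shows ISO strings, which may just be how the dates were serialized). The layout of the explain output differs between server versions, so the helper walks the whole document rather than relying on a fixed path.

```javascript
// Prepend one $match that covers every facet's filter, so documents are
// narrowed (ideally through an index) before $facet receives them.
function withLeadingMatch(pipeline, ...filters) {
  return [{ $match: { $or: filters } }, ...pipeline];
}

// Recursively collect every string stored under a "stage" key in an explain
// document, for example "IXSCAN" or "COLLSCAN".
function collectPlanStages(node, found = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectPlanStages(child, found));
  } else if (node !== null && typeof node === 'object') {
    for (const [key, value] of Object.entries(node)) {
      if (key === 'stage' && typeof value === 'string') {
        found.push(value);
      } else {
        collectPlanStages(value, found);
      }
    }
  }
  return found;
}

// Filters shaped like the ones in the question (dates are an assumption).
const match = {
  school_id: 2460,
  created_at: { $gte: new Date('2024-03-02T09:53:29.828Z') },
  user: { $exists: true },
};
const last5MinsMatch = {
  school_id: 2460,
  created_at: {
    $gte: new Date('2024-03-02T09:49:29.828Z'),
    $lt: new Date('2024-03-02T09:53:29.829Z'),
  },
  user: { $exists: true },
};

// Shortened stand-in for the $facet stage from the question; the $group,
// $project and $sort stages would follow each inner $match unchanged.
const facetPipeline = [
  {
    $facet: {
      last1Min: [{ $match: match }],
      last5Mins: [{ $match: last5MinsMatch }],
    },
  },
];

const pipeline = withLeadingMatch(facetPipeline, match, last5MinsMatch);

// With a live connection, the plan could then be inspected like this:
//   const plan = await collection.aggregate(pipeline).explain('executionStats');
//   console.log(collectPlanStages(plan));
```

The inner $match stages stay in place because each facet still has to separate its own time window from the combined result. If the list of stages printed for the real plan still shows only COLLSCAN, the leading $match is not being served by the school_id index and the filter or the index definition is worth a second look.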
