I am trying to find the max of a value in a range of dates. The aggregate query I use has a match on indexed column _id. But the query takes too long and the explain plan tells me its going of a COLLSCAN and not an index scan. Can you please suggest why it wont make use of the index on _id?
Would it help if I created another index on colId?
{$match:{_id:{ $regex: 'regex'}}},
{$match:{$and:[{"colId":'DATA'}]}},
{$unwind:"$data"},
{$match:{$and:[{"data.time":{$gte:ISODate("xyz"),$lte:ISODate("zyx")}}]}},
{$match:{$and: [{ "data.col": { $exists: true}}] }},
{$group:{_id:"$data.time",maxCol:{$max:"$data.col"}}} ,
{$sort:{"maxCol":-1,_id:-1}},
{$limit:1}
])
Explain plan snippet:
"winningPlan" : {
"stage" : "LIMIT_SKIP",
"inputStage" : {
"stage" : "SORT",
"sortPattern" : {
"_id" : -1,
"maxCol" : -1
},
"inputStage" : {
"stage" : "SUBSCAN",
"inputStage" : {
"stage" : "HASH_AGGREGATE",
"inputStage" : {
"stage" : "SUBSCAN",
"inputStage" : {
"stage" : "PROJECTION",
"inputStage" : {
"stage" : "COLLSCAN"
}
}
}
}
}
}
This is on DocumentDB (mongo4)
2
Answers
DocumentDB does not seem to support index scan automatically when there is a regular expression match involved in the indexed column. Document suggests that we use hints to nudge AWS to use those indexes (wonder why the spoon-feeding).
db.collection.aggregate([all_your_filters], {hint: {_id: 1}});
But in my case adding the hint threw errors because it did not like me using non-capturing groups (?:x) in my regex. So I had to remove groups from my regex for the hints to work.
I think regular expression cannot use indexes.
$match
works also on array, try this one:As consequence put an index on
{colId: 1, "data.time": 1}
or{colId: 1, "data.time": 1, "data.col": 1}