Context :
I’m trying to build an architecture displaying POIs which can be at different know locations over time.
I have 2 collections,
pois
{
_id: ObjectId,
name: string
}
locations
_id: ObjectId,
point: {
type: 'Point',
coordinates: Array<number>
},
poi: ObjectId // Reference to Poi
Use-case :
So I’m trying to build a query which
- takes a center coordinate + radius in input
- and return the matching Pois within than radius
- with only their nearest Location
- sorted by distance
Ideally, having this output :
[
{
_id: ObjectId(AE54232),
name: 'Some poi',
location: {
_id: ObjectId(BFE5423),
point: {
type: 'Point',
coordinates: [3, 50]
},
distance: 3
}
}
]
Attempt
Reading carefully at the docs, I used this combination :
// Keep only locations within radius,
// output 'distance'
// and sort by distance
{
$geoNear: {
near: nearCenter,
key: 'point',
distanceField: 'distance',
maxDistance: nearRadius,
spherical: true,
},
},
// Keep only first (assumed 'nearest')
// location of each poi
{
$group: {
_id: '$poi',
location: {
$first: '$$ROOT'
}
}
},
// Retrieve poi
{
$lookup: {
from: 'pois',
localField: '_id',
foreignField: '_id',
as: 'poi',
},
},
// Flatten poi
{
$unwind: {
path: '$poi',
},
},
// Push poi at the root,
// and put location inside 'location'
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
"$poi",
{ location: "$location" },
]
},
}
},
So to sumup :
$geoNear
$first(by poi)
$lookup(poi)
$unwind(poi)
$replaceRoot(poi { location })
Trouble
I’m facing a strange behavior where the query basically works; excepts its not sorted by distance : the poi
s with their location
come in an erratic and non-determinist order !
I tried commenting every step one by one, and apparently this is the $first
which is causing the "shuffle". Which is surprising since docs states :
Outputs documents in order of nearest to farthest from a specified point.
Returns the value that results from applying an expression to the first document in a group of documents. Only meaningful when documents are in a defined order.
Fix attempts
My thought was $first
expects an actual $sort
and not an implicit $geoNear
sorting; so I tried to intercalate a $sort
in between like so :
{
$sort: {
'distance': 1,
},
},
in between like so :
$geoNear
$sort(distance)
<== HERE$first(by poi)
$lookup(poi)
$unwind(poi)
$replaceRoot(poi { location })
But it gave me the exact same result !
The only thing that worked is adding a $sort
at the very end like so
{
$sort: {
'location.distance': 1,
},
},
$geoNear
$first(by poi)
$lookup(poi)
$unwind(poi)
$replaceRoot(poi { location })
$sort(location.distance)
<== HERE
But I’m concerned it could have performance issues on large datasets
Question
Is there any way to accomplish that logic
- filter $geoNear (keep distance)
- $group by referenced document, keep only ‘nearest’
Without loosing $geoNear order ?
2
Answers
To expand on @nimrod serok's accepted answers
I add an explaination on the "why" of this (too long for a comment).
Statement
A
$geoNear
then$group
($first
) onpoi
don't necessarily lead topoi
s sorted by distance.The reason is simple, but MongoDb doc is a bit unclear about it :
$first
This doesn't mean that "order of groups would be kept consistent"; this means that the attribution of the
$first
of each group will be consistent, only if it's sorted in input.The key is in the "note" in the middle of the docs page :
This basically means that order of input is respected for the
$first
resolution; but the order of groups themselves is not consistent.Example case
Say this is the result of $geoNear
$group($first(poi))
guarantees that location 1 will be kept and location 3 discarded; but does not guarantees that location 1 will be output before location 2.The following result is then legit :
This is by nature.
If each
poi
can have few locations then grouping them may change the order so the documents after the grouping are no longer sorted bydistance
. You can sort bydistance
after the grouping to solve it: