In my MongoDB collection, documents are stored in the following format:
{ "_id" : ObjectId("62XXXXXX"), "res" : 12, ... }
{ "_id" : ObjectId("63XXXXXX"), "res" : 23, ... }
{ "_id" : ObjectId("64XXXXXX"), "res" : 78, ... }
...
I need to extract the ids of the documents whose "res" value is an outlier, i.e. value < Q1 - 1.5 * IQR or value > Q3 + 1.5 * IQR, where Q1 and Q3 are the 25th and 75th percentiles and IQR = Q3 - Q1. I have done this with pandas by retrieving all documents from the collection, which may become slow if the number of documents in the collection becomes too large.
Is there a way to do this with a MongoDB aggregation pipeline (or at least to calculate the percentiles there)?
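For reference, the outlier rule described above can be sketched client-side in plain Python. This is only an illustration of the rule, not the pandas code from the question; the function name `iqr_outliers` and the sample values are made up, and the quartiles use linear interpolation between data points.

```python
from statistics import quantiles


def iqr_outliers(values):
    """Return the values lying outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    # n=4 yields the three quartile cut points; "inclusive" interpolates
    # linearly between the sorted data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]


sample = [10, 12, 11, 13, 12, 11, 95]  # made-up "res" values
print(iqr_outliers(sample))
```

A server-side pipeline should select the same documents as this check, up to differences in how the quartiles are estimated.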
2 Answers
If I understand how you want to retrieve outliers, here’s one way you might be able to do it.
Try it on mongoplayground.net.
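The pipeline from this answer is not reproduced here, so the following is only a minimal sketch of one way to keep the work on the server, not necessarily the approach used in the answer. It assumes MongoDB 7.0 or later, where the `$percentile` accumulator is available in `$group`, and it assumes pymongo on the client. The helper names `outlier_bounds_pipeline` and `outlier_filter` are hypothetical. The first helper builds a pipeline that returns a single document holding the two fences; the second builds a query filter for documents outside them.

```python
def outlier_bounds_pipeline(field="res"):
    """Build a pipeline yielding one document: {"lower": ..., "upper": ...}."""
    return [
        # Compute Q1 and Q3 over the whole collection (MongoDB 7.0+ assumed).
        {"$group": {
            "_id": None,
            "quartiles": {"$percentile": {
                "input": f"${field}",
                "p": [0.25, 0.75],
                "method": "approximate",
            }},
        }},
        # Unpack the two-element result array into named fields.
        {"$project": {
            "_id": 0,
            "q1": {"$arrayElemAt": ["$quartiles", 0]},
            "q3": {"$arrayElemAt": ["$quartiles", 1]},
        }},
        {"$set": {"iqr": {"$subtract": ["$q3", "$q1"]}}},
        # Lower fence Q1 - 1.5*IQR and upper fence Q3 + 1.5*IQR.
        {"$project": {
            "lower": {"$subtract": ["$q1", {"$multiply": [1.5, "$iqr"]}]},
            "upper": {"$add": ["$q3", {"$multiply": [1.5, "$iqr"]}]},
        }},
    ]


def outlier_filter(lower, upper, field="res"):
    """Build a find() filter matching values outside [lower, upper]."""
    return {"$or": [{field: {"$lt": lower}}, {field: {"$gt": upper}}]}


# Usage with pymongo (coll is an assumed pymongo Collection):
#   bounds = next(coll.aggregate(outlier_bounds_pipeline()))
#   ids = [d["_id"] for d in coll.find(
#       outlier_filter(bounds["lower"], bounds["upper"]), {"_id": 1})]
```

This takes two round trips, but only the fences and the matching ids leave the server, and an index on "res" can serve the second query.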
One more option, based on @rickhg12hs's answer, is to use $setWindowFields.
See how it works on the playground example.
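The pipeline itself is not shown here, so below is a hedged sketch of what a `$setWindowFields` variant could look like, not the exact pipeline from the playground. It assumes MongoDB 7.0 or later, where `$percentile` can be used as a window operator; with no `partitionBy` and no window bounds, the window should cover the whole collection. The function name `outlier_ids_pipeline` and the temporary field names `quartiles`, `q1`, `q3` and `iqr` are made up and would overwrite same-named fields within the pipeline if the documents already have them.

```python
def outlier_ids_pipeline(field="res"):
    """Build a single pipeline that returns the _id of each outlier document."""
    # Fence expressions, evaluated per document once q1, q3, iqr are set.
    lower = {"$subtract": ["$q1", {"$multiply": [1.5, "$iqr"]}]}
    upper = {"$add": ["$q3", {"$multiply": [1.5, "$iqr"]}]}
    return [
        # Attach the collection-wide quartiles to every document
        # (MongoDB 7.0+ assumed for $percentile as a window operator).
        {"$setWindowFields": {
            "output": {
                "quartiles": {"$percentile": {
                    "input": f"${field}",
                    "p": [0.25, 0.75],
                    "method": "approximate",
                }},
            },
        }},
        {"$set": {
            "q1": {"$arrayElemAt": ["$quartiles", 0]},
            "q3": {"$arrayElemAt": ["$quartiles", 1]},
        }},
        {"$set": {"iqr": {"$subtract": ["$q3", "$q1"]}}},
        # Keep only documents below the lower fence or above the upper fence.
        {"$match": {"$expr": {"$or": [
            {"$lt": [f"${field}", lower]},
            {"$gt": [f"${field}", upper]},
        ]}}},
        {"$project": {"_id": 1}},
    ]


# Usage with pymongo (coll is an assumed pymongo Collection):
#   ids = [d["_id"] for d in coll.aggregate(outlier_ids_pipeline())]
```

This needs only one round trip, at the cost of the server computing the window over every document before the `$match` stage can filter.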