I am trying to search my MongoDB collection of products. The dataset holds multiple documents per product to record price over time. I would like to search for a phrase, then limit the results to one document per UPC. My current code works well, but it returns multiple documents with the same UPC.
Current code, which returns multiple documents with the same UPC:
response = self.DB.find({'$text': {'$search': f'/{search}/'}}, {'Response': 0, '_id': 0}).sort("timestamp", -1)
Example Data Set:
{
"_id": {
"$oid": "64cf05707844ef1a25ee57fa"
},
"upc": "032622013625",
"name": "Luigi Bormioli Michelangelo Beverage 20oz Set of 4",
"salePrice": 29.99,
"timestamp": "2023-08-05 22:29:04 EDT-0400",
}
{
"_id": {
"$oid": "64cf057c7844ef1a25ee57fd"
},
"upc": "048894970887",
"name": "Basic Window Fan - Holmes",
"salePrice": 54.99,
"available": false,
"timestamp": "2023-08-05 22:29:16 EDT-0400",
}
{
"_id": {
"$oid": "64cf05707844ef1a25ee57fa"
},
"upc": "032622013625",
"name": "Luigi Bormioli Michelangelo Beverage 20oz Set of 4",
"salePrice": 29.97,
"timestamp": "2023-08-04 13:25:09 EDT-0400",
}
I am not sure whether I should be using distinct or find.
2 Answers
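I filtered the array of results using a dictionary that stores each UPC already seen, appending a document to a list of results only if its UPC is not yet in the dictionary. For the UPC 032622013625, that leaves only the newest document: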
[{'upc': '032622013625', 'name': 'Luigi Bormioli Michelangelo Beverage 20oz Set of 4', 'salePrice': 29.99, 'timestamp': '2023-08-05 22:29:04 EDT-0400'}]
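A minimal sketch of that client-side filter might look like the following. The function name `first_per_upc` and the sample list are illustrative, not from the original answer, and the sketch assumes the documents arrive already sorted newest first, as the `.sort("timestamp", -1)` in the question does.

```python
def first_per_upc(docs):
    """Keep only the first document seen for each UPC.

    Assumes `docs` is already sorted newest first, so the first
    document per UPC is also the most recent one.
    """
    seen = {}      # UPCs encountered so far
    unique = []    # one document per UPC, in input order
    for doc in docs:
        upc = doc["upc"]
        if upc not in seen:
            seen[upc] = True
            unique.append(doc)
    return unique


# Sample documents taken from the question, sorted by timestamp descending.
sample_docs = [
    {"upc": "048894970887", "name": "Basic Window Fan - Holmes",
     "salePrice": 54.99, "timestamp": "2023-08-05 22:29:16 EDT-0400"},
    {"upc": "032622013625", "name": "Luigi Bormioli Michelangelo Beverage 20oz Set of 4",
     "salePrice": 29.99, "timestamp": "2023-08-05 22:29:04 EDT-0400"},
    {"upc": "032622013625", "name": "Luigi Bormioli Michelangelo Beverage 20oz Set of 4",
     "salePrice": 29.97, "timestamp": "2023-08-04 13:25:09 EDT-0400"},
]

deduplicated = first_per_upc(sample_docs)
```

In the question's code, the cursor returned by `find(...).sort(...)` could be passed in place of `sample_docs`. Note that this still transfers every matching document from the server before filtering.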
You could use "$top" with "$group" in an aggregation pipeline to get your result. If you only want certain fields returned, you could use a "$project" stage.
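One way such a pipeline could be put together is sketched below. The helper name `build_latest_per_upc_pipeline` and the intermediate field name `latest` are hypothetical. The sketch assumes MongoDB 5.2 or newer (where the "$top" accumulator is available), a text index on the collection, and timestamp strings that all share the format shown in the question so that they sort correctly as strings.

```python
def build_latest_per_upc_pipeline(search):
    """Build an aggregation pipeline returning the newest document per UPC."""
    return [
        # A $text match must be the first stage of the pipeline.
        {"$match": {"$text": {"$search": search}}},
        # Group by UPC and keep the whole document with the latest timestamp.
        {"$group": {
            "_id": "$upc",
            "latest": {"$top": {
                "sortBy": {"timestamp": -1},
                "output": "$$ROOT",
            }},
        }},
        # Promote the kept document back to the top level.
        {"$replaceRoot": {"newRoot": "$latest"}},
        # Drop the same fields that the question's projection excludes.
        {"$project": {"_id": 0, "Response": 0}},
    ]


pipeline = build_latest_per_upc_pipeline("fan")

# With a PyMongo collection, as in the question, this could be run as:
# response = self.DB.aggregate(pipeline)
```

Because the grouping happens on the server, only one document per UPC is sent back to the client.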