I have 2 collections users
and files
. A user document looks like this :
{
"_id": "user1",
"name": "John"
}
A file
document looks like this :
{
"_id": "fileID1",
"url": "any_url",
"userID": "user1"
}
I have a list of user IDs let’s say : ["user1", "user2", "user5"]
and i want to perform a single query that takes this list
as input and returns 5 files of each user ID.
For now i have a naive logic that is not really performant :
result_users = users_collection.find({}).limit(15)
for user in result_users:
user_id.append(user["_id"])
result_files = files_collection.find({"userID": user["_id"]}).limit(5)
for f in result_files:
file_dict[user["_id"]].append(f)
The query i am looking for would naively be something like :
result_files = files_collection.find({"userID": {"$in / $each": user_ids_list}).limit(5)
The output i am looking for is :
{
"user1": ["fileID1", "fileID2", "fileID8", "fileID3"],
"user2": ["fileID4", "fileID5", "fileID6", "fileID9", "fileID7"],
...
}
With at max 5 files per userID.
I am pretty sure this is achievable with aggregation
but i have no clue how to write it correctly.
Note : i don’t really care about the output format, i only need a list of file that is sorted correctly with 5 files per userID.
I can map the output myself when i am confident the query returns the right files
.
Thanks !
Edit : it would be amazing to add a sort("at", DESCENDING) on the files collection aggregation
2
Answers
That’s in JS, but I am sure in py you could do something like below:
This should give you an output similar to:
since you already have the user names in the file docs you won’t need to query user collection.
$match
based on existence of user in input arrayuserID
field and push the file id and the timestamp (will need later for sorting)$sortArray
on the grouped file array (v). Now it is sorted in descending(-1). On that sorted call$slice
and finally map over the sliced array to get an array of just the ids.this will now generate in the format
[ { _id:"user1",v: [...]}, { _id:"user2",v: [...]}... ]
playground
if you want to further convert it to
{ "user1": [...], "user2": [...], ... }
format you can add these 2 stagesplayground