db.inventory.aggregate( [
   {
      $lookup: {
         from: "order",
         localField: "_id",
         foreignField: "item_id",
         as: "inventory_docs"
      }
   }
] )
The $lookup joins on the item_id field, which is indexed.
If 100,000 documents pass through this $lookup, the query takes about 4X longer than it does without the $lookup.
Given that the joined field is indexed, a 4X slowdown is unexpected; I was expecting only a marginal increase in query time.
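For reference, the index assumed here would be on the foreignField of the order collection; in mongosh it would have been created with something like (collection and field names taken from the pipeline above):

```
// create the index on the foreignField used by the $lookup
db.order.createIndex( { item_id: 1 } )
```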
Is this also the case for SQL databases? Will an indexed join increase query time by 4X?
EDIT:
explain doc:
{
"$lookup": {
"from": "order",
"as": "inventory_docs",
"localField": "_id",
"foreignField": "item_id",
"let": {},
"pipeline": [
{
"$project": {
"_id": 1
}
}
]
},
"totalDocsExamined": 0,
"totalKeysExamined": 100008,
"collectionScans": 0,
"indexesUsed": [
"_id_"
],
"nReturned": 100008,
"executionTimeMillisEstimate": 18801
}
So it took 18 seconds to look up 100,000 documents via an indexed field, _id. This seems really slow.
2 Answers
I suggest you run an explain of that aggregation.
Check whether the index is being used, and confirm that the $lookup is what is taking all that extra time.
You can share the explain output here to provide more context.
You can find the docs for explaining your aggregation pipelines here:
https://www.mongodb.com/docs/manual/reference/method/db.collection.explain/
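Using the pipeline from the question, running it in the "executionStats" verbosity mode would look something like this:

```
// run the pipeline with execution statistics to see
// index usage and per-stage timing estimates
db.inventory.explain("executionStats").aggregate( [
   {
      $lookup: {
         from: "order",
         localField: "_id",
         foreignField: "item_id",
         as: "inventory_docs"
      }
   }
] )
```

In the output, look at fields such as indexesUsed, collectionScans, totalKeysExamined, and executionTimeMillisEstimate for the $lookup stage.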
There is no general answer to your question.
Let's have a look at Oracle's history. In earlier times (i.e. prior to Oracle 8i, which was released in 1997), the execution path was determined by a rule-based optimizer. The query was analyzed and the existence of indexes was checked; based on this information, the execution path was selected.
Today the Oracle optimizer also considers the data itself, using statistics. The MongoDB
$lookup
would be an OUTER JOIN, and Oracle supports five different outer-join types. The Oracle optimizer fills entire books. Also, MongoDB is constantly improving its data access methods; see for example the Slot-Based Query Execution Engine.
Your question is way too broad!