skip to Main Content

I’m not sure if it is a real problem or just lack of documentations.
You can put conditions for documents in foreign collection in a lookup $match.
You can also put conditions for the documents of original collection in a lookup $match with $expr.

But when I want to use both of those features, it doesn’t work. This is sample lookup in aggregation

  { $lookup:
    {
      from: 'books',
      localField: 'itemId',
      foreignField: '_id',
      let: { "itemType": "$itemType" },
      pipeline: [
         { $match: { $expr: { $eq: ["$$itemType", "book"] } }}
      ],
      as: 'bookData'
    }
  }

$expr is putting condition for original documents. But what if I want to get only foreign documents with status: 'OK' ? Something like:

{ $match: { status: "OK", $expr: { $eq: ["$$itemType", "book"] } }}

Does not work.

2

Answers


  1. I tried to play with the situation you provided.
    Try to put $expr as the first key of $match object. And it should do the thing.

       { $lookup:
        {
          from: 'books',
          localField: 'itemId',
          foreignField: '_id',
          let: { "itemType": "$itemType" },
          pipeline: [
             { $match: { $expr: { $eq: ["$$itemType", "book"] }, status: 'OK' }}
          ],
          as: 'bookData'
        }
      }
    
    Login or Signup to reply.
  2. The currently accepted answer is "wrong" in the sense that it doesn’t actually change anything. The ordering that the fields for the $match predicate are expressed in does not make a difference. I would demonstrate this with your specific situation, but there is an extra complication there which we will get to in a moment. In the meantime, consider the following document:

      {
        _id: 1,
        status: "OK",
        key: 123
      }
    

    This query:

    db.collection.find({
      status: "OK",
      $expr: {
        $eq: [
          "$key",
          123
        ]
      }
    })
    

    And this query, which just has the order of the predicates reversed:

    db.collection.find({
      $expr: {
        $eq: [
          "$key",
          123
        ]
      },
      status: "OK"
    })
    

    Will both find and return that document. A playground demonstration of the first can be found here and the second one is here.

    Similarly, your original $match:

    { $match: { status: "OK", $expr: { $eq: ["$$itemType", "book"] } }}
    

    Will behave the same as the one in the accepted answer:

    { $match: { $expr: { $eq: ["$$itemType", "book"] }, status: 'OK' }}
    

    Said another way, there is no difference in behavior based on whether or not the $expr is used first. However, I suspect the overall aggregation is not expressing your desired logic. Let’s explore that a little further. First, we need to address this:

    $expr is putting condition for original documents.

    This is not really true. According to the documentation for $expr, that operator "allows the use of aggregation expressions within the query language."

    A primary use of this functionality, and indeed the first one listed in the documentation, is to compare two fields from a single document. In the context of $lookup, this ability to refer to fields from the original documents allows you to compare their values against the collection that you are joining with. The documentation has some examples of that, such as here and other places on that page which refer to $expr.

    With that in mind, let’s come back to your aggregation. If I am understanding correctly, your intent with the { $expr: { $eq: ["$$itemType", "book"] } predicate is to filter documents from the original collection. Is that right?

    If so, then that is not what your aggregation is currently doing. You can see in this playground example that the $match nested inside of the $lookup pipeline does not affect the documents from the original collection. Instead, you should do that filtering via an initial $match on the base pipeline. So something like this:

    db.orders.aggregate([
      {
        $match: {
          $expr: {
            $eq: [
              "$itemType",
              "book"
            ]
          }
        }
      }
    ])
    

    Or, more simply, this:

    db.orders.aggregate([
      {
        $match: {
          "itemType": "book"
        }
      }
    ])
    

    Based on all of this, your final pipeline should probably look similar to the following:

    db.orders.aggregate([
      {
        $match: {
          "itemType": "book"
        }
      },
      {
        $lookup: {
          from: "books",
          localField: "itemId",
          foreignField: "_id",
          let: {
            "itemType": "$itemType"
          },
          pipeline: [
            {
              $match: {
                status: "OK"
              }
            }
          ],
          as: "bookData"
        }
      }
    ])
    

    Playground example here. This pipeline:

    1. Filters the data in the original collection (orders) by their itemType. From the sample data, it removes the document with _id: 3 as it has a different itemType than the one we are looking for ("book").
    2. It uses the localField/foreignField syntax to find data in books where the _id of the books document matches the itemId of the source document(s) in the orders collection.
    3. It further uses the let/pipeline syntax to express the additional condition that the status of the books document is "OK". This is why books document with the status of "BAD" does not get pulled into the bookData for the orders document with _id: 2.

    Documentation for the (combined) second and third parts is here.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search