skip to Main Content

Is there a MongoDB aggregation query equivalent to the following SQL statement using a single MongoDB aggregation? (The two operands of the JOIN are two subqueries that also include join operations.)

SELECT 
    T1.C1, T2.C2 
FROM
    (SELECT 
         T3.C1 C1, T4.C2 C2 
     FROM 
         T3 
     JOIN 
         T4 ON T3.C0 = T4.C0) T1
JOIN
    (SELECT 
         T5.C1 C1, T6.C2 C2 
     FROM 
         T5 
     JOIN 
         T6 ON T5.C0 = T6.C0) T2 ON T1.C1 = T2.C1 
                                 AND T1.C2 = T2.C2;

2

Answers


  1. Short answer – no, lookup requires "from" to be a collection.

    As WD pointed out, SQL patterns do not work in NoSQL universe, yet database is a database – if data is there you can always fetch it. The logic will be inverted – you will do SELECT T3.C1, T6.C2 FROM T3 and complex lookups/unwinds from T4, T5 and T6.

    Considering the OP’s snippet, it would be fair to assume the question is more academic than practical. From this perspective – it’s still "no". Don’t design/denormalise data to the degree that you find yourself in the position where you need to run such query.

    Login or Signup to reply.
  2. This does not exactly address the OP question re. subqueries but show how, if absolutely necessary, "temporary" collections can be used in lieu of subqueries. Note that in the original SQL, only tables T3.T4,T5, and T6 actually exist; T1 and T2 are subquery constructs. Also, the original SQL is not particularly useful because it only shows matching columns; we know nothing else about the rows from whence they came, so we will carry along old_id just for show.

    
    db.XT3.drop();
    var r = [
        {C0: "A", C1: "R0-1", C2: "R0-2"}
        ,{C0: "A", C1: "R1-1", C2: "R1-2"}
        ,{C0: "B", C1: "R0-1", C2: "R0-2"}
        ,{C0: "C", C1: "R0-1", C2: "R0-2"}
    ];
    var nn = 0; r.forEach(function(d) { d['_id'] = nn++ }); // auto incr ID
    db.XT3.insertMany(r);
    
    db.XT4.drop();
    var r = [
        {C0: "A", C1: "R0-1", C2: "R0-2"}
        ,{C0: "A", C1: "R2-1", C2: "R2-2"}
        ,{C0: "B", C1: "R0-1", C2: "R0-2"}
        ,{C0: "no_match", C1: "R0-1", C2: "R0-2"}
    ];
    r.forEach(function(d) { d['_id'] = nn++ }); // auto incr ID
    db.XT4.insertMany(r);
    
    
    db.XT5.drop();
    var r = [
        {C0: "X", C1: "nope", C2: "nope"}
        ,{C0: "X", C1: "R1-1", C2: "R1-2"}
        ,{C0: "Y", C1: "R0-1", C2: "R0-2"}
        ,{C0: "Z", C1: "R0-1", C2: "R0-2"}
    ];
    r.forEach(function(d) { d['_id'] = nn++ }); // auto incr ID
    db.XT5.insertMany(r);
    
    db.XT6.drop();
    var r = [
        {C0: "X", C1: "R0-1", C2: "R0-2"}
        ,{C0: "Y", C1: "R0-1", C2: "R0-2"}
        ,{C0: "oiwejfioj", C1: "R0-1", C2: "R0-2"}
    ];
    r.forEach(function(d) { d['_id'] = nn++ }); // auto incr ID
    db.XT6.insertMany(r);
    
    
    
    db.TMP1.drop();
    db.TMP2.drop();
    
    c=db.XT3.aggregate([
        {$lookup: {"from": "XT4",
               let: { id: "$C0" },
               pipeline: [
               {$match: {$expr: {$eq: [ "$C0", "$$id" ]} }}
               ,{$project: {_id:false,C0:false}} // no need to carry them
               ],
               as: "T_3_4"
              }}
    
        // This ALSO acts to filter out items from XT3 with NO match to XT4:
        ,{$unwind: '$T_3_4'}
    
        // $unwind will dupe _id of inbound XT3 record; must get rid of it
        // in prep for $out; keep it around for debugging/tracking as 'old_id':
        ,{$project: {_id:false, old_id:'$_id', C1:'$C1', C2:'$T_3_4.C2'}}
        ,{$out: "TMP1"}
    ]);
    
    db.XT5.aggregate([
        {$lookup: {"from": "XT6",
               let: { id: "$C0" },
               pipeline: [
               {$match: {$expr: {$eq: [ "$C0", "$$id" ]} }}
               ,{$project: {_id:false,C0:false}} // no need to carry them
               ],
               as: "T_5_6"
              }}
        ,{$unwind: '$T_5_6'}
        ,{$project: {_id:false, old_id:'$_id', C1:'$C1', C2:'$T_5_6.C2'}}    
        ,{$out: "TMP2"}
    ]);
    
    c = db.TMP1.aggregate([
        {$lookup: {"from": "TMP2",
               let: { c1: "$C1", c2: "$C2" },
    
               // ON T1.C1 = T2.C1 AND T1.C2 = T2.C2;
               pipeline: [
               {$match: {$expr: {$and: [
                   {$eq: [ "$$c1", "$C1" ]},
                   {$eq: [ "$$c2", "$C2" ]}
               ]} }}
               ,{$project: {_id:false}}
               ],
               as: "X"
              }}
    
        // Filter out non-matches; comment out this stage to double check
        // docs where T1.C1 = T2.C1 AND T1.C2 = T2.C2 is NOT satisfied:
        ,{$match: {$expr: {$ne:[0, {$size:'$X'}]} }}
    ]);
    emit(c);
    

    yields

    {
      _id: ObjectId("655e1e5a4e762532ed9ba7aa"),
      old_id: 0,
      C1: 'R0-1',
      C2: 'R0-2',
      X: [
        {
          old_id: 10,
          C1: 'R0-1',
          C2: 'R0-2'
        }
      ]
    }
    {
      _id: ObjectId("655e1e5a4e762532ed9ba7ac"),
      old_id: 1,
      C1: 'R1-1',
      C2: 'R0-2',
      X: [
        {
          old_id: 9,
          C1: 'R1-1',
          C2: 'R0-2'
        }
      ]
    }
    {
      _id: ObjectId("655e1e5a4e762532ed9ba7ae"),
      old_id: 2,
      C1: 'R0-1',
      C2: 'R0-2',
      X: [
        {
          old_id: 10,
          C1: 'R0-1',
          C2: 'R0-2'
        }
      ]
    }
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search