
I currently have three separate aggregate queries, each producing a different count grouped by companyRegNo. Is there a way to combine them into a single query?

The employeeActions data looks like this:

{   
    "email": "[email protected]",
    "companyRegNo" : 105,    
    "event" : {
        "created" : ISODate("2022-09-16T06:42:42.761Z"),
        "desc" : "COMPLETED_APPLICATIONS",
        "note" : "Direct apply",       
    }
}
{   
    "email": "[email protected]",
    "companyRegNo" : 247,
    "event" : {
        "created" : ISODate("2022-09-16T06:42:04.387Z"),       
        "desc" : "COMPLETED_APPLICATIONS",
        "note" : "",       
    }
}
{   
    "email": "[email protected]",
    "companyRegNo" : 247,    
    "event" : {
        "created" : ISODate("2022-09-16T06:42:42.761Z"),
        "desc" : "COMPLETED_REVIEW",
        "note" : "Sent for review",       
    }
}
{   
    "email": "[email protected]",
    "companyRegNo" : 105,    
    "event" : {
        "created" : ISODate("2022-09-16T06:42:42.761Z"),
        "desc" : "COMPLETED_REVIEW",
        "note" : "Data is complete",       
    }
}
{   
    "email": "[email protected]",
    "companyRegNo" : 247,
    "event" : {
        "created" : ISODate("2022-09-16T06:42:04.387Z"),       
        "desc" : "COMPLETED_OFFER",
        "note" : "",       
    }
}
{   
    "email": "[email protected]",
    "companyRegNo" : 227,
    "event" : {
        "created" : ISODate("2022-09-16T06:42:04.387Z"),       
        "desc" : "COMPLETED_APPLICATIONS",
        "note" : "",       
    }
}
{   
    "email": "[email protected]",
    "companyRegNo" : 227,    
    "event" : {
        "created" : ISODate("2022-09-16T06:42:42.761Z"),
        "desc" : "COMPLETED_APPLICATIONS",
        "note" : "",       
    }
}

{   
    "email": "[email protected]",
    "companyRegNo" : 105,    
    "event" : {
        "created" : ISODate("2022-09-16T06:42:42.761Z"),
        "desc" : "COMPLETED_APPLICATIONS",
        "note" : "Direct apply",       
    }
}

The 3 different queries are:

Query for Review count

db.getCollection('employeeActions').aggregate([
{"$match": { 
    "event.created": {"$gte": ISODate("2022-06-01 00:00:00.000Z")},
    "$or": [ {"event.desc": "COMPLETED_REVIEW"}, {"event.note": "Sent for review"}],
}},
{"$group":{"_id":"$companyRegNo","count": {"$sum": 1 } } },
{"$project":{ "companyRegNo":"$_id","count": "$count","_id":0}}
])

The result will be

105    1  
227    0  
247    1  

Query for Offer count

db.getCollection('employeeActions').aggregate([
{"$match": { 
    "event.created": {"$gte": ISODate("2022-06-01 00:00:00.000Z")},
    "event.desc": "COMPLETED_OFFER"
}},
{"$group":{"_id":"$companyRegNo","count": {"$sum": 1 } } },
{"$project":{ "companyRegNo":"$_id","count": "$count","_id":0}},
])

The result will be

105    0  
227    0  
247    1  

Query for Applications count

db.getCollection('employeeActions').aggregate([
{"$match": { 
    "event.created": {"$gte": ISODate("2022-06-01 00:00:00.000Z")},
    "event.desc": "COMPLETED_APPLICATIONS"
}},
{"$group":{"_id":"$companyRegNo","count": {"$sum": 1 } } },
{"$project":{ "companyRegNo":"$_id","count": "$count","_id":0}},
])

The result will be

105    2
227    2
247    1

Is there a way to combine these 3 queries so that it gives me the results as below?

CompanyRegNo Applications Reviews Offers

105            2           1         0
227            2           0         0
247            1           1         1

I tried using $facet to combine these queries, but it gives me three totals across all companies instead of per-company counts:

"offers" : 1, "reviews" : 2, "applications" : 5 

I need the counts to be split by company. This is the $facet query I tried:

db.getCollection('employeeActions').aggregate([
    { "$match": { "event.created": {"$gte": ISODate("2022-01-01 00:00:00.000Z")}}},
    { "$facet": {                            
          "offers": [                           
            {"$match":{ "event.desc": "COMPLETED_OFFER" } },
            {"$group":{"_id":"$companyRegNo","count": {"$sum": 1 } } },
            { "$count": "offers" },
          ],
          "reviews": [                           
            {"$match":{ 
                "$or": [ {"event.desc": "COMPLETED_REVIEW"}, {"event.note": "Sent for review"}],
               }},
            {"$group":{"_id":"$companyRegNo","count": {"$sum": 1 }} },
            { "$count": "reviews" },
          ],
          "applications": [                           
            {"$match":{   "event.desc": "COMPLETED_APPLICATIONS" } },
            {"$group":{"_id":"$companyRegNo","count":{"$sum": 1 }} },
            { "$count": "applications" },
          ]    
       }
    },
    { "$project": {
        "offers": { "$arrayElemAt": ["$offers.offers", 0] },
        "reviews": { "$arrayElemAt": ["$reviews.reviews", 0] },
        "applications": { "$arrayElemAt": ["$applications.applications", 0] }
      }}
 ])

2 Answers


  1. Yes, you can use the $facet stage to run multiple aggregation pipelines over the same input documents at the same time:

    { 
      $facet: {
         count_1: [ /* your first query */ ],
         count_2: [ /* your second query */ ],
         count_3: [ /* your third query */ ]
      }
    }
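
For the question's data, the skeleton above might be filled in roughly as follows. This is only a sketch: the `branch` helper and the `rows` and `n` field names are invented here for illustration, and the cutoff date is copied from the question's three standalone queries. Each branch keeps its per-company `$group` output (there is no trailing `$count`, which would only count the number of groups), and the stages after `$facet` merge the three arrays back into one row per company.

```javascript
// Cutoff taken from the question's standalone queries.
const since = new Date("2022-06-01T00:00:00.000Z");

// Hypothetical helper: one facet branch that counts matching events per
// company and stores the count under the given column name.
const branch = (filter, field) => [
  { $match: filter },
  { $group: { _id: "$companyRegNo", n: { $sum: 1 } } },
  { $project: { _id: 0, companyRegNo: "$_id", [field]: "$n" } },
];

const pipeline = [
  { $match: { "event.created": { $gte: since } } },
  { $facet: {
      applications: branch({ "event.desc": "COMPLETED_APPLICATIONS" }, "applications"),
      reviews: branch(
        { $or: [{ "event.desc": "COMPLETED_REVIEW" }, { "event.note": "Sent for review" }] },
        "reviews"
      ),
      offers: branch({ "event.desc": "COMPLETED_OFFER" }, "offers"),
  } },
  // Flatten the three per-company arrays into one stream of rows.
  { $project: { rows: { $concatArrays: ["$applications", "$reviews", "$offers"] } } },
  { $unwind: "$rows" },
  // Regroup by company; $sum treats a field missing from a row as 0.
  { $group: {
      _id: "$rows.companyRegNo",
      applications: { $sum: "$rows.applications" },
      reviews: { $sum: "$rows.reviews" },
      offers: { $sum: "$rows.offers" },
  } },
  { $sort: { _id: 1 } },
];
```

In mongosh this would be run as `db.getCollection('employeeActions').aggregate(pipeline)`. A company that has no events at all after the cutoff would not appear in the output.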
    
  2. The answer by @NeNaD describes the conventional approach (and I upvoted it), since $facet was created for cases like this, but it is worth knowing that there are alternatives. One disadvantage of $facet is that it returns the output of every sub-pipeline inside a single document, and a document has a size limit (16 MB for a BSON document). If your collection is large and many documents match your first condition, you may want to consider other methods.

    One alternative is using $group with $cond:

    db.collection.aggregate([
      {$match: {
          "event.created": {$gte: ISODate("2022-06-01T00:00:00.000Z")}
      }},
      {$group: {
          _id: "$companyRegNo",
          ReviewCount: {
            $sum: {$cond: [
                {$or: [
                    {$eq: ["$event.desc", "COMPLETED_REVIEW"]},
                    {$eq: ["$event.note", "Sent for review"]}
                  ]
                }, 1, 0]}
          },
          OfferCount: {
            $sum: {$cond: [{$eq: ["$event.desc", "COMPLETED_OFFER"]}, 1, 0]}
          },
          ApplicationsCount: {
            $sum: {$cond: [{$eq: ["$event.desc", "COMPLETED_APPLICATIONS"]}, 1, 0]}
          }
        }
      }
    ])
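
To get the column layout asked for in the question, two more stages could be appended after the `$group` stage. This is a sketch: the output names `Applications`, `Reviews` and `Offers` are taken from the question's desired table, and the `reshape` variable name is made up here.

```javascript
// Stages to append to the end of the pipeline, after the $group stage.
const reshape = [
  // Rename _id back to companyRegNo and use the question's column names.
  { $project: {
      _id: 0,
      companyRegNo: "$_id",
      Applications: "$ApplicationsCount",
      Reviews: "$ReviewCount",
      Offers: "$OfferCount",
  } },
  // Return the companies in ascending order of registration number.
  { $sort: { companyRegNo: 1 } },
];
```

Because the `$group` stage creates a group for every company that passes the `$match`, a company with no reviews or offers still gets a row with 0 in those columns.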
    

    See how it works on the playground example
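
The same conditional-sum logic can also be checked without a database. The sketch below is plain JavaScript, not MongoDB code: `countByCompany` is a hypothetical stand-in for the `$group` stage, the date filter is left out because every sample event falls after the cutoff, and the events are the question's eight sample documents reduced to the fields the stage reads.

```javascript
// The question's sample events, reduced to the fields that matter here.
const events = [
  { companyRegNo: 105, event: { desc: "COMPLETED_APPLICATIONS", note: "Direct apply" } },
  { companyRegNo: 247, event: { desc: "COMPLETED_APPLICATIONS", note: "" } },
  { companyRegNo: 247, event: { desc: "COMPLETED_REVIEW", note: "Sent for review" } },
  { companyRegNo: 105, event: { desc: "COMPLETED_REVIEW", note: "Data is complete" } },
  { companyRegNo: 247, event: { desc: "COMPLETED_OFFER", note: "" } },
  { companyRegNo: 227, event: { desc: "COMPLETED_APPLICATIONS", note: "" } },
  { companyRegNo: 227, event: { desc: "COMPLETED_APPLICATIONS", note: "" } },
  { companyRegNo: 105, event: { desc: "COMPLETED_APPLICATIONS", note: "Direct apply" } },
];

// Stand-in for the $group/$cond stage: each event adds 1 or 0 to each counter.
function countByCompany(docs) {
  const totals = new Map();
  for (const { companyRegNo, event } of docs) {
    if (!totals.has(companyRegNo)) {
      totals.set(companyRegNo, { companyRegNo, ApplicationsCount: 0, ReviewCount: 0, OfferCount: 0 });
    }
    const row = totals.get(companyRegNo);
    const isReview = event.desc === "COMPLETED_REVIEW" || event.note === "Sent for review";
    row.ReviewCount += isReview ? 1 : 0;
    row.OfferCount += event.desc === "COMPLETED_OFFER" ? 1 : 0;
    row.ApplicationsCount += event.desc === "COMPLETED_APPLICATIONS" ? 1 : 0;
  }
  // Sort by company number so the rows line up with the question's table.
  return [...totals.values()].sort((a, b) => a.companyRegNo - b.companyRegNo);
}

const result = countByCompany(events);
console.table(result);
```

For the sample data this gives 2 applications, 1 review and 0 offers for company 105; 2, 0 and 0 for company 227; and 1, 1 and 1 for company 247, which matches the table requested in the question.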
