skip to Main Content

I have an ip address collection:

{  
 "_id" : "uezyuLx4jjfvcqN",   
  "CVE" : ["CVE2020-123", "CVE2022-789", "CVE2019-456"],   
  "ip" : "1.2.3.4"   
}
{  
  "_id" : "dCC8GrNdEjym3ryua",   
  "CVE" : ["CVE2020-123", "CVE2021-469"],  
  "ip" : "5.6.7.8"  
}
{  
  "_id" : "dCC8GrNdEjym3ryua",   
  "CVE" : ["CVE2020-123", "CVE2021-469"],  
  "ip" : "7.6.7.6"  
}

I’m trying to calculate the distinct sum of the CVE field, where IPs are in ["5.6.7.8", "1.2.3.4"].

Expected output:

{
 ip: ['1.2.3.4', '5.6.7.8'], 
 sum_distinct_cve:4, 
 CVES: ["CVE2020-123", "CVE2022-789", "CVE2019-456", "CVE2021-469"]
}

So I’m doing the following:

db = db.getSiblingDB("test");
hosts = db.getCollection("test-collection")
hosts.aggregate([
  {$match:
    {"ip": {$in: ["1.2.3.4", "5.6.7.8"]}}},
  {$group: 
    {_id: "$CVE",
    totals: {$sum: "$CVE"}}}
]);

The sum is returning 0, which I’ve realised is because of MongoDb’s behaviour when trying to sum a string field. This is detailed here: mongodb sum query returning zero

What I would like to know though is how I can sum the number of elements, and also find the distinct sum.`

2

Answers


  1. Simple option:

    db.collection.aggregate([
    {
     $match: {
      "ip": {
        $in: [
          "1.2.3.4",
          "5.6.7.8"
        ]
       }
      }
    },
    {
      $unwind: "$CVE"
    },
    {
     $group: {
      _id: "",
      ip: {
        $addToSet: "$ip"
      },
      CVE: {
        $addToSet: "$CVE"
       }
      }
      },
     {
    $project: {
      _id: 0,
      ip: 1,
      CVE: 1,
      sum_distinct_cve: {
        $size: "$CVE"
       }
      }
     }
    ])
    

    Explained:

    1. Match the ip’s
    2. unwind the CVE arrays
    3. group so you can join ip and CVE distinct values only
    4. Project the necessary fields and use $size to count the distinct CVE’s

    Playground

    Login or Signup to reply.
  2. I agree with @R2D2. One more option to avoid $unwind (which considered costly in terms of performance) is to use $reduce instead:

    db.collection.aggregate([
      {$match: {ip: {$in: ["1.2.3.4", "5.6.7.8"]}}},
      {$group: {
          _id: 0,
          ip: {$addToSet: "$ip"},
          CVE: {$addToSet: "$CVE"}
      }},
      {$project: {
          _id: 0, ip: 1,
          CVE: {
            $reduce: {
              input: "$CVE",
              initialValue: [],
              in: {$setUnion: ["$$value", "$$this"]}
            }
          }
      }},
      {$set: {sum_distinct_cve: {$size: "$CVE"}}}
    ])
    

    See how it works on the playground example

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search