skip to Main Content

I have data from the Twitter API and I want to retrieve a list of all cities if the countryCode is equal to "AU".

{ 
"gnip" : {
    "matching_rules" : [
        {
            "value" : "bio_location: "Australia"", 
            "tag" : null
        }, 
        {
            "value" : "bio_location: "Canberra"", 
            "tag" : null
        }
    ], 
    "klout_score" : 40, 
    "language" : {
        "value" : "en"
    }, 
    "profileLocations" : [
        {
            "objectType" : "place", 
            "geo" : {
                "type" : "point", 
                "coordinates" : [
                    149.12807, 
                    -35.28346
                ]
            }, 
            "address" : {
                "country" : "Australia", 
                "countryCode" : "AU", 
                "locality" : "Canberra", 
                "region" : "Australian Capital Territory"
            }, 
            "displayName" : "Canberra, Australian Capital Territory, Australia"
        }
    ]
}, 

}

It is important to note here that the distinct() method is not sufficient here since I do not want to have the distinct city names, but each city for each tweet if it is located in Australia.

the output should look like this:
[
Sydney,
Melbourne,
Cranberra
]

2

Answers


  1. db.collection.aggregate([
      {
        $match: {
          "gnip.profileLocations.address.countryCode": "AU"
        }
      },
      {
        $unwind: "$gnip.profileLocations"
      },
      {
        $match: {
          "gnip.profileLocations.address.countryCode": "AU"
        }
      },
      {
        $group: {
          _id: null,
          results: {
            $addToSet: "$gnip.profileLocations.address.locality"
          }
        }
      }
    ])
    

    mongoplayground

    Login or Signup to reply.
  2. The following aggregation query returns the distinct city names as aggregation result.

    db.collection.aggregate([
    { 
        $project: {
            _id: 0, 
            result: {
                $let: {
                    vars: {
                        filtered: { 
                            $filter: { 
                                 input: "$gnip.profileLocations", 
                                 cond: { 
                                     $eq: [ "$$this.address.countryCode", "AU" ] 
                                 }
                            }
                        }
                    },
                    in: {
                        $setIntersection: [ 
                            { 
                                $map: { 
                                     input: "$$filtered", 
                                     in: "$$this.address.locality"
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
    ])
    

    The $filter aggregation operator is used to select the array data with country code as "AU", and the $map is used for getting just the address locality values. The $setIntersection is to have the locality names distinct.

    Note the aggregation output is always a cursor with document(s). Example output: { "result" : [ "Canberra", "Sydney" ] }.

    To get just the array from the output, do this:

    var agg_output = db.collection.aggregate( // substitute pipeline here from above qyery).toArray()
    var output = (agg_output.length > 0) ? agg_output[0].result : [ ]  
    // [ "Canberra", "Sydney" ]
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search