
I have a list of movies like this:

[
   {
      "title":"X",
      "genres":[
         {
            "tag":"Horror"
         },
         {
            "tag":"Thriller"
         },
         {
            "tag":"Mystery"
         }
      ]
   },
   {
      "title":"Zero Dark Thirty",
      "genres":[
         {
            "tag":"Thriller"
         },
         {
            "tag":"Drama"
         },
         {
            "tag":"History"
         },
         {
            "tag":"War"
         }
      ]
   }
]

I want to list all unique genres and count the number of movies in each, so that the output looks like this:

{
   "Horror":1,
   "Thriller":2,
   "Mystery":1,
   "Drama":1,
   "History":1,
   "War":1
}

Is this possible with jq?

2 Answers


  1. Yes, it is:

    1. Extract all genre tags into a single array.
    2. Group identical tags together.
    3. Map each group to a key-value pair (key = any element of the group, here the first; value = the number of elements in the group).
    4. Build an object from the key-value pairs.

    map(.genres[].tag)
    | group_by(.)
    | map({ key:first, value:length })
    | from_entries
    

    Output:

    {
      "Drama": 1,
      "History": 1,
      "Horror": 1,
      "Mystery": 1,
      "Thriller": 2,
      "War": 1
    }
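
To see why the keys come out alphabetically rather than in the order used in the question, it can help to run the first two stages on their own. This is only a sketch: it assumes jq is on the PATH, and the shell variable name `movies` is made up for illustration.

```shell
# Sample input from the question, compacted (the variable name is illustrative).
movies='[
  {"title":"X","genres":[{"tag":"Horror"},{"tag":"Thriller"},{"tag":"Mystery"}]},
  {"title":"Zero Dark Thirty","genres":[{"tag":"Thriller"},{"tag":"Drama"},{"tag":"History"},{"tag":"War"}]}
]'

# Stage 1: flatten every genre tag into one array (-c prints compact output).
printf '%s' "$movies" | jq -c 'map(.genres[].tag)'

# Stages 1 and 2: group_by sorts the groups by their key,
# which is where the alphabetical order comes from.
printf '%s' "$movies" | jq -c 'map(.genres[].tag) | group_by(.)'
```

The first command should print the tags in input order; the second should print one group per tag, sorted by tag, with both "Thriller" entries in the same group.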
    

    Alternatively, use a reduce-based approach and simply increment a counter per genre:

    reduce .[].genres[].tag as $genre ({}; .[$genre] += 1)
    

    This is likely more efficient than building an array and grouping, since group_by has to sort its input while reduce makes a single pass.
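
As a rough check of the reduce approach, the sketch below runs it against the sample input. It assumes jq is on the PATH; the names `movies`, `count_genres` and `count_movies_per_genre` are invented for this example. The second function is a variant that is not part of the answer above: it counts each movie at most once per genre, which only matters if a movie can list the same tag twice.

```shell
# Sample input from the question, compacted (the variable name is illustrative).
movies='[
  {"title":"X","genres":[{"tag":"Horror"},{"tag":"Thriller"},{"tag":"Mystery"}]},
  {"title":"Zero Dark Thirty","genres":[{"tag":"Thriller"},{"tag":"Drama"},{"tag":"History"},{"tag":"War"}]}
]'

# The reduce from the answer: one counter per tag, incremented per occurrence.
count_genres() {
  jq -c 'reduce .[].genres[].tag as $genre ({}; .[$genre] += 1)'
}

# Hypothetical variant: de-duplicate tags within each movie before counting.
count_movies_per_genre() {
  jq -c 'reduce (.[] | [.genres[].tag] | unique | .[]) as $genre ({}; .[$genre] += 1)'
}

printf '%s' "$movies" | count_genres

# A made-up movie that lists the same tag twice shows where the two differ.
dup='[{"title":"A","genres":[{"tag":"War"},{"tag":"War"}]}]'
printf '%s' "$dup" | count_genres            # counts occurrences
printf '%s' "$dup" | count_movies_per_genre  # counts movies
```

Because reduce adds keys as it first meets them, the result for the sample input should list the genres in first-occurrence order (Horror, Thriller, Mystery, Drama, History, War), which is the order asked for in the question.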

  2. There is a generic "bag of words" function that makes it easy to solve this task efficiently:

    def bow(stream): 
      reduce stream as $word ({}; .[($word|tostring)] += 1);
    

    With this arrow in your quiver, there are many ways to solve the specific problem at hand. Here’s one that assumes every .tag value found anywhere in the input should be counted:

    bow(.. | objects | .tag // empty)
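
For completeness, here is how the definition and the call might be combined into one invocation. This is a sketch assuming jq is on the PATH; the shell variables `movies`, `bow_def` and `mixed` are invented for the example. The last command illustrates what the `tostring` in `bow` is for: non-string values such as numbers and booleans become keys instead of raising an indexing error.

```shell
# Sample input from the question, compacted (the variable name is illustrative).
movies='[
  {"title":"X","genres":[{"tag":"Horror"},{"tag":"Thriller"},{"tag":"Mystery"}]},
  {"title":"Zero Dark Thirty","genres":[{"tag":"Thriller"},{"tag":"Drama"},{"tag":"History"},{"tag":"War"}]}
]'

# The bow definition from the answer, kept in single quotes so the shell
# leaves $word alone.
bow_def='def bow(stream): reduce stream as $word ({}; .[($word|tostring)] += 1);'

# Definition followed by the call; objects without a .tag are skipped by "// empty".
printf '%s' "$movies" | jq -c "$bow_def bow(.. | objects | .tag // empty)"

# bow on a made-up array of mixed types: each value is stringified before counting.
mixed='[1,1,"a",true]'
printf '%s' "$mixed" | jq -c "$bow_def bow(.[])"
```

For the sample input, the first command should give the same counts as the reduce in the other answer; the second should count the number 1 twice under the key "1".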
    