skip to Main Content

I would like to produce a list of all values from any given JSON. The special requirement is that for values in nested attributes, all the values of parent’s attributes must be repeated in the line. No keys must be printed.

For example, from this input JSON:

{
   "results":[
      {
         "id":306,
         "name":"First Company",
         "branches":[
            {
               "id":4191,
               "city":"Seattle",
               "customers":[
                  {
                     "id":446,
                     "name":"Big Tech 1"
                  },
                  {
                     "id":447,
                     "name":"Big Tech 2"
                  }
               ]
            },
            {
               "id":4192,
               "city":"Oakland",
               "customers":[
                  {
                     "id":448,
                     "name":"Health Tech 1"
                  },
                  {
                     "id":449,
                     "name":"Health Tech 2"
                  }
               ]
            }
         ]
      }
   ]
}

I would like to produce this output (please notice the repeated values are highlighted in red, but the output should be with no colors; no keys should be printed):

enter image description here

The JSON above is just an example and generic JSON with arbitrary depth of nesting of JSON objects must be assumed, which suggests that the processing must be recursive. If null appears they must be printed as well.

2

Answers


  1. Iterate and nest:

    jq -r '
      .results[] | [.id, .name] + (
        .branches[] | [.id, .city] + (
          .customers[] | [.id, .name]
        )
      ) | @tsv
    '
    
    306 First Company   4191    Seattle 446 Big Tech 1
    306 First Company   4191    Seattle 447 Big Tech 2
    306 First Company   4192    Oakland 448 Health Tech 1
    306 First Company   4192    Oakland 449 Health Tech 2
    

    Demo

    To keep the values JSON-encoded (quotes around strings), you could re-convert them using @json

    jq -r '
      .results[] | [.id, .name] + (
        .branches[] | [.id, .city] + (
          .customers[] | [.id, .name]
        )
      ) | map(@json) | @tsv
    '
    
    306 "First Company" 4191    "Seattle"   446 "Big Tech 1"
    306 "First Company" 4191    "Seattle"   447 "Big Tech 2"
    306 "First Company" 4192    "Oakland"   448 "Health Tech 1"
    306 "First Company" 4192    "Oakland"   449 "Health Tech 2"
    

    Demo

    Alternatively (to the JSON encoding), you could use the @csv builtin, which also escapes strings but also separates the items with a comma:

    jq -r '
      .results[] | [.id, .name] + (
        .branches[] | [.id, .city] + (
          .customers[] | [.id, .name]
        )
      ) | @csv
    '
    
    306,"First Company",4191,"Seattle",446,"Big Tech 1"
    306,"First Company",4191,"Seattle",447,"Big Tech 2"
    306,"First Company",4192,"Oakland",448,"Health Tech 1"
    306,"First Company",4192,"Oakland",449,"Health Tech 2"
    

    Demo

    Login or Signup to reply.
  2. The following makes various assumptions but only hardcodes "results", "branches" and "customers",
    and does not assume that the ordering of the keys within the customer objects is uniform.
    Using scalar_keys in the obvious way, the solution can also be readily extended to allow for greater depth, and
    indeed could be generalized to compute the array-valued keys.

    def scalar_keys:
      [ keys_unsorted[] as $k
        | .[$k]
        | scalars
        | $k ];
        
    .results[]
    | [.[scalar_keys[]] ] as $top
    | .branches[]
    | [.[scalar_keys[]] ] as $intermediate
    | .customers
    | (first | keys_unsorted) as $keys
    | .[]
    | [.[$keys[]]]
    

    This produces arrays, so you will want to tack on a call to @tsv or @csv or similar.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search