
Suppose I have this ndjson file:

{"id": 99, "labeled_values": [["one", "green"], ["two", "red"], ["three", "blue"]]}
{"id": 100, "labeled_values": [["four", "green"], ["five", "red"]]}

How do I get the output below (tab-separated)? That is: id on every line, and the pairs of (value, label) flattened.

99 one green
99 two red
99 three blue
100 four green
100 five red

Here are two failed attempts:

$ cat tmp/foo.ndjson | jq -r '.id as $id | [$id, .labeled_values.[0], .labeled_values.[1]] '
[
  99,
  [
    "one",
    "green"
  ],
  [
    "two",
    "red"
  ]
]
[
  100,
  [
    "four",
    "green"
  ],
  [
    "five",
    "red"
  ]
]

$ cat tmp/foo.ndjson | jq -r '.id as $id | [$id, .labeled_values[].[0], .labeled_values[].[1]] '
[
  99,
  "one",
  "two",
  "three",
  "green",
  "red",
  "blue"
]
[
  100,
  "four",
  "five",
  "green",
  "red"
]

This question is closely related, but I still don’t understand jq well enough to adapt its answer.

A related question: how do I learn enough about the processing model of jq to understand how to break down nested structures like this into flat structures?

Most of the examples I find are flat and simple. They pick out single fields, not nested things.

2 Answers


  1. .labeled_values[] produces a separate output for each element of .labeled_values.

    To convert .id to a string and append a tab (\t), use string interpolation.

    Use @tsv to convert each inner array to a tab-separated string.

    "\(.id)\t\(.labeled_values[] | @tsv)"
    
  2. Or simply:

      [.id] + .labeled_values[] | @tsv
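
To see why both answers work, it may help to run the filter one stage at a time. The following is a minimal sketch, assuming jq is installed and using a temporary file (the variable names f, step1, step2, out_concat and out_interp are made up for this illustration). It recreates the sample input from the question, then shows what each stage emits.

```shell
# Assumption: jq is on PATH. The temp file stands in for tmp/foo.ndjson.
f="$(mktemp)"
printf '%s\n' '{"id": 99, "labeled_values": [["one", "green"], ["two", "red"], ["three", "blue"]]}' '{"id": 100, "labeled_values": [["four", "green"], ["five", "red"]]}' > "$f"

# Stage 1: .labeled_values[] emits each inner pair as its own output.
step1="$(jq -c '.labeled_values[]' "$f")"

# Stage 2: an expression containing that iterator is evaluated once per
# emitted pair, so [.id] + <pair> yields one flat array per pair.
step2="$(jq -c '[.id] + .labeled_values[]' "$f")"

# Stage 3: @tsv formats each flat array as one tab-separated string;
# -r prints the string raw instead of as a quoted JSON string.
out_concat="$(jq -r '[.id] + .labeled_values[] | @tsv' "$f")"

# The interpolation form from the first answer, for comparison.
out_interp="$(jq -r '"\(.id)\t\(.labeled_values[] | @tsv)"' "$f")"

printf '%s\n' "--- stage 1" "$step1" "--- stage 2" "$step2" "--- final" "$out_concat"
rm -f "$f"
```

Both final forms should print the five tab-separated lines requested in the question. The failed attempts differ in one respect: they put the iterator inside an array constructor, and [ ... ] collects every output of its contents into a single array, so you get one array per input record rather than one per pair. Keeping the iterator outside the brackets is what produces one row per pair.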
    