skip to Main Content

a potentially huge json-lines file with objects of known structure is to be converted to csv with headers.

example

{"name":"name_0","value_a":"value_a_0","value_b":"val_b_0"}
{"name":"name_1","value_a":"value_a_1","value_b":"val_b_1"}
{"name":"name_2","value_a":"value_a_2","value_b":"val_b_2"}
{"name":"name_3","value_a":"value_a_3","value_b":"val_b_3"}
{"name":"name_4","value_a":"value_a_4","value_b":"val_b_4"}

expected output

"name","value_a","value_b"
"name_0","value_a_0","val_b_0"
"name_1","value_a_1","val_b_1"
"name_2","value_a_2","val_b_2"
"name_3","value_a_3","val_b_3"
"name_4","value_a_4","val_b_4"

currently tried

(if (input_line_number == 1 ) then ([.|to_entries|.[].key]|@csv) else empty end), 
(.|to_entries|[.[].value]|@csv  )

However this relies on the order in the json
as an alternative I have substituted it with directly selecting the values in the order I want.

(if (input_line_number == 1 ) then (""name","value_a","value_b"") else empty end), (.|[.name?,.value_a?,.value_b?]|@csv  ) 

jqplay

any better solution? especially regarding the if, as it feels bulky.

I mainly don’t want to use slurp because it will resort to load the whole file into memory

2

Answers


  1. Don’t overthink it; add a fixed header and use inputs together with -n/--null-input to format the actual content:

    jq -n '["name", "value_a", "value_b"],
    (inputs | [.name?, .value_a?, .value_b?])
    | @csv' input.json
    

    Output:

    "name","value_a","value_b"
    "name_0","value_a_0","val_b_0"
    "name_1","value_a_1","val_b_1"
    "name_2","value_a_2","val_b_2"
    "name_3","value_a_3","val_b_3"
    "name_4","value_a_4","val_b_4"
    
    Login or Signup to reply.
  2. it’s not jq, but I add it because I think it’s interesting to know it.

    Using Miller and run

    mlr --j2c cat input.jsonl >output.csv
    

    you get

    name,value_a,value_b
    name_0,value_a_0,val_b_0
    name_1,value_a_1,val_b_1
    name_2,value_a_2,val_b_2
    name_3,value_a_3,val_b_3
    name_4,value_a_4,val_b_4
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search