I’ve 2 files: changes.csv and sample.json. My csv file is like following:
header "a", "b"
"a11","b1"
"a22","b2"
"a33","b3"
and the json file is like:
[
{"a":"a1","b":"b1"},
{"a":"a2","b":"b2"},
{"a":"a3","b":"b3"},
{"a":"a4","b":"b4"}
]
I need to write a jq command, which make changes in json file using the csv file, i.e., the final output of the json file should be like following:
[
{"a":"a11","b":"b1"},
{"a":"a22","b":"b2"},
{"a":"a33","b":"b3"},
{"a":"a4","b":"b4"}
]
I wrote the following command:
while IFS=",", read f1 f2
do
jq --argjson k1 $f1 --argjson k2 $f2 '(.[] | select(.b == $k2) | .a) |= $k1' sample.json| sponge sample.json
done < changes.csv
Although, for each iteration it is able to filter and update the value of key "a", but when I try to sponge the results into the json file, it is unable to do so. Don’t know where exactly I am missing out.
3
Answers
So I just figured it out that if I transform my csv file values into input readable format and use --arg instead of --argjson, the file gets update fine. So, the command
awk '{gsub(/"/,"")};1' b.csv > c.csv
will convert the csv file into:So, now if apply the command:
the desired changes will take place.
The plausible reason why it failed in older format because when we read values using "read" command, it transforms
"a"
to""a""
. So, each time I append the answer, it returns void answer.The answer works with any type of json format. For example, if json object is like:
we can update the command to
'(.[] | select(.b == $k2) | .d.a) |= $k1'
.I didn't try the other methods posted, but I appreciate for the help. Thanks folks (@peak & @Fravadona)
Assuming the CSV is reasonably well-behaved, you could write:
For more messy CSV, you will probably want to use a CSV-to-JSON or CSV-to-TSV tool (which can both quite easily be written in jq — see e.g. https://rosettacode.org/wiki/Convert_CSV_records_to_TSV#jq)
If you prefer not to use the –argfile option, then by all means use some other method of reading the two files, e.g. you could use –rawfile for the CSV, leaving STDIN for the JSON.
For a pure
jq
solution, you’ll better make sure that your CSV doesn’t contain any,
or"
orn
in any field or else the code will become a lot more complex.Hence, I propose a solution with Miller (available here for several OSs), which can do the task easily and robustly:
Let’s decompose the command:
will join file1 and file2 on the field
b
. When an other field thanb
exists in both files (for ex.a
) then it is the value of file2 that is outputted.So, for updating the JSON with the values in the CSV, file1 shall be the JSON and file2 the CSV.
--ul
means to output the unjoinable lines of file1; without it, the output will only contain the records that have been "paired".With the
join
verb, Miller allows to specify a different input format for file1 than the one of file2; you set the default input format as--icsv
(for handling file2) and override it with--ijson
after thejoin
verb (for handling file1).The output format is set to JSON with
--ojson
. The--no-jvstack
means to output one "record" per line.