
I have multiple JSON files in subfolders, each containing an array of objects with the same keys but different values.
For example, file1.json:

[
    {
        "id": 1,
        "name": "John"
    },
    {
        "id": 2,
        "name": "Mary"
    }
]

file2.json:

[
    {
        "id": 1,
        "name": "mulberry"
    },
    {
        "id": 2,
        "name": "strawberry"
    }
]

I want to globally increment the value of "id" across all files without resetting the number between files.

What I have here works, but it only increments the ids locally within each file and resets the counter on the next file. What I want is a global increment across all files, so that if two separate files are later combined, the id fields remain unique and numeric.

Expected result:

file1.json:

[
    {
        "id": 1,
        "name": "John"
    },
    {
        "id": 2,
        "name": "Mary"
    }
]

file2.json:

[
    {
        "id": 3,
        "name": "mulberry"
    },
    {
        "id": 4,
        "name": "strawberry"
    }
]

Here is what I have:

#!/bin/bash
i=0
grep -rl --exclude=${0:2} . -e '"id":' | while read -r file; do
    echo -e "\nEditing: $file\n"
    jq 'to_entries | map(.value["id"] = '"$i"') | map(.value)' "$file"
    ((i++))
done

2 Answers


  1. Chosen as BEST ANSWER

    This solution from a colleague works (it also redirects jq's output to a temp file before overwriting the original file): jq --argjson id "$start_id" '.[] |= . + {"id": ($id + (.id-1))}' "$file" > tmp.json && mv tmp.json "$file"

    The final code hardcodes the array length (the +10 step) to avoid the mess @peak mentioned at the end of his answer:

    #!/bin/bash
    start_id=1
    # Null-delimited find/read so paths with spaces survive
    find . -type f -name '*.json' -print0 | while IFS= read -r -d '' file; do
        echo "Updating $file"
        # Re-base each file's 1-based ids onto the running counter
        jq --argjson id "$start_id" '.[] |= . + {"id": ($id + (.id-1))}' "$file" > tmp.json && mv tmp.json "$file"
        echo "Updated $file with start_id=$start_id"
        start_id=$((start_id + 10))   # hardcoded: assumes at most 10 objects per file
    done
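
    If the array length shouldn't be hardcoded, here is a minimal variation (a sketch, untested) that spends one extra jq call per file to measure each array instead, assuming every file's ids start at 1 as in the question:

    #!/bin/bash
    start_id=1
    find . -type f -name '*.json' -print0 | while IFS= read -r -d '' file; do
        jq --argjson id "$start_id" '.[] |= . + {"id": ($id + (.id-1))}' "$file" > tmp.json && mv tmp.json "$file"
        # Advance the counter by the actual number of objects in this file
        start_id=$((start_id + $(jq 'length' "$file")))
    done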
    

  2. The following avoids having to call jq more than once, but results in the files being populated with "compact" JSON:

    jq -nrc '
       # resetId: renumber .id of each array element, starting at $start
       def resetId($start):
         . as $in | [range(0; length) as $i | $in[$i] | .id = $start + $i];
       foreach inputs as $file ({n:1};
         .n as $n
         | .emit = [input_filename, ($file | resetId($n))]
         | .n += ($file|length) )
      | .emit[] ' file1.json file2.json |
      awk 'NR%2==1 { fn=$1; next } { print $0 > fn }'
    

    Note that this will overwrite the files, so you might want to tweak the last line to avoid that.
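
    For instance (a sketch), the awk step could write each result to a sibling file instead of overwriting the original:

    awk 'NR%2==1 { fn=$1; next } { print $0 > (fn ".new") }'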

    If you want to use the above approach but have the JSON pretty-printed in each output file, you could call jq on each resultant file, as sketched below. This still requires fewer calls to jq than the obvious alternative, namely calling jq twice per file: once to determine the array length, and again to perform the update.
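
    A minimal sketch of that post-processing step (jq pretty-prints by default):

    for f in file1.json file2.json; do
        jq . "$f" > tmp.json && mv tmp.json "$f"
    done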

    Yet another (rather messy) option would be to call jq once to determine all the array sizes, and then call jq again for each file.
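
    For completeness, here is one rough sketch of that two-pass variant (untested; assumes ids start at 1 in each file and filenames without whitespace):

    #!/bin/bash
    start_id=1
    # Pass 1: a single jq call emits "filename length" for every input array
    jq -nr 'inputs | "\(input_filename) \(length)"' file1.json file2.json |
    while read -r file len; do
        # Pass 2: one jq call per file applies the running offset
        jq --argjson id "$start_id" '.[] |= . + {"id": ($id + (.id-1))}' "$file" > tmp.json && mv tmp.json "$file"
        start_id=$((start_id + len))
    done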
