Imagine we have two JSON objects:
{"messages":["one"], "keyA": "valueA"}
and
{"messages":["two"], "keyB": "valueB"}
I expect there’s a way to merge these two objects while concatenating the array values, such that the resulting object would be:
{"messages":["one","two"], "keyA": "valueA", "keyB": "valueB"}
Most of the approaches I’ve seen thus far to do this are inadequate in that the array gets overwritten by the "right most" object’s version.
i.e.:
echo '{"messages":["one"], "keyA": "valueA"}{"messages":["two"], "keyB": "valueB"}' | jq -s '.[0] * .[1]'
produces:
{
  "messages": [
    "two"
  ],
  "keyA": "valueA",
  "keyB": "valueB"
}
(Note: the messages array contains only "two", taken from the second (right-most) object.)
From the jq manual on addition (the + operator):
Objects are added by merging, that is, inserting all the key-value pairs from both objects into a single combined object. If both objects contain a value for the same key, the object on the right of the + wins. (For recursive merge use the * operator.)
(emphasis added)
But changing the + operator to * does not appear to change the output.
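A small experiment makes the difference between the two operators visible. This is only a sketch: the nested meta key and the shell variable names are made up for illustration, and --argjson assumes jq 1.5 or later. The * operator recurses only into values that are objects on both sides, so an array is still replaced wholesale by the right-hand value.

```shell
# Illustrative inputs: "meta" is a hypothetical nested object, not from the question.
a='{"meta":{"x":1},"messages":["one"]}'
b='{"meta":{"y":2},"messages":["two"]}'

# Shallow merge: the right-hand value wins for every shared key.
plus=$(jq -cn --argjson a "$a" --argjson b "$b" '$a + $b')

# Recursive merge: nested objects are merged key by key, but arrays are not concatenated.
star=$(jq -cn --argjson a "$a" --argjson b "$b" '$a * $b')

echo "$plus"
echo "$star"
```

With + the result keeps only meta.y; with * it keeps both meta.x and meta.y. In both cases messages holds only "two".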
I’ve seen jq: recursively merge objects and concatenate arrays but… wow… is there no better way?
Bonus points if the solution can handle an object whose array key’s value is null as if it were an empty array:
{"messages":null, "keyA": "valueA"}{"messages":["two"], "keyB": "valueB"}
2 Answers
There are a couple of approaches I found:
First, this is a known issue
Second, from the known issues I found this:
jq -s '[.[] | to_entries] | flatten | reduce .[] as $dot ({}; .[$dot.key] += $dot.value)'
e.g.:
echo '{"messages":["one"], "keyA": "valueA"}{"messages":["two"], "keyB": "valueB"}' | jq -s '[.[] | to_entries] | flatten | reduce .[] as $dot ({}; .[$dot.key] += $dot.value)'
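produces the merged object from the question, {"messages":["one","two"], "keyA": "valueA", "keyB": "valueB"} (as desired).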
This is from https://github.com/stedolan/jq/issues/502
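One caveat worth checking before relying on this filter: it applies + to every key, not just to arrays. When both objects share a scalar key, strings are concatenated and numbers are summed rather than the right-hand value winning. A small sketch, where the name and count keys are made up for illustration:

```shell
# Illustrative input: "name" and "count" are hypothetical keys present in both objects.
input='{"messages":["one"],"name":"foo","count":1}{"messages":["two"],"name":"bar","count":2}'

# Same filter as above; every shared key is combined with +.
result=$(printf '%s' "$input" | jq -cs '[.[] | to_entries] | flatten | reduce .[] as $dot ({}; .[$dot.key] += $dot.value)')
echo "$result"
```

Here messages becomes ["one","two"], but name becomes "foobar" and count becomes 3, which may or may not be what you want.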
Additionally, inspired by https://github.com/stedolan/jq/issues/957 I was able to do this:
echo '{"messages":["one"], "keyA": "valueA"}{"messages":["two"], "keyB": "valueB"}' | jq -s '.[2].messages = .[0].messages + .[1].messages | .[0] + .[1] + .[2]'
This also produces the expected output, but in a non-generalized way. The idea of storing the merged array in a third element of the root array is pretty neat (which is why I mention it): the existing "right side wins" behavior then inserts the properly merged array back into the result.
Lastly, from that same issue, there's this:
echo '{"messages":["one"], "keyA": "valueA"}{"messages":["two"], "keyB": "valueB"}' | jq -s '.[0] as $o1 | .[1] as $o2 | ($o1 + $o2) | .messages = ($o1.messages + $o2.messages)'
which is essentially the same non-generalized solution as the temp-storage approach above (but perhaps more elegant?).
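Both non-generalized variants should also cover the "bonus points" case, because in jq adding null to an array yields the array unchanged. A quick check against the null input from the question (the shell variable names are just for illustration):

```shell
# The "bonus" input from the question: messages is null in the first object.
input='{"messages":null, "keyA": "valueA"}{"messages":["two"], "keyB": "valueB"}'

# Same filter as above; null + array yields the array.
result=$(printf '%s' "$input" | jq -cs '.[0] as $o1 | .[1] as $o2 | ($o1 + $o2) | .messages = ($o1.messages + $o2.messages)')
echo "$result"
```

The resulting messages array is ["two"], with keyA and keyB both present.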
For simple examples such as in the question, the following filter provides a simple but principled approach that also qualifies for the “bonus points”:
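A minimal sketch of a filter in this spirit, not necessarily the one this answer had in mind. The helper names arr_or_null and merge_concat are hypothetical (they are not jq builtins), and the def f($b) syntax assumes jq 1.5 or later. It concatenates when both sides are arrays (treating null as an empty array) and otherwise lets the right-hand value win:

```shell
# Hypothetical helpers: arr_or_null and merge_concat are illustrative names, not jq builtins.
filter='
def arr_or_null: type == "array" or type == "null";

# Merge $b into the input object; concatenate when both values are arrays (or null).
def merge_concat($b):
  reduce ($b | to_entries[]) as $e (.;
    if (.[$e.key] | arr_or_null) and ($e.value | arr_or_null)
    then .[$e.key] = (.[$e.key] + $e.value)
    else .[$e.key] = $e.value   # otherwise the right-hand value wins
    end);

.[1] as $b | .[0] | merge_concat($b)
'

basic='{"messages":["one"], "keyA": "valueA"}{"messages":["two"], "keyB": "valueB"}'
bonus='{"messages":null, "keyA": "valueA"}{"messages":["two"], "keyB": "valueB"}'

merged_basic=$(printf '%s' "$basic" | jq -cs "$filter")
merged_bonus=$(printf '%s' "$bonus" | jq -cs "$filter")
echo "$merged_basic"
echo "$merged_bonus"
```

For the first input, messages becomes ["one","two"]; for the null case it becomes ["two"].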
Of course, this is not commutative.
For more than two objects, simply use reduce, e.g. for an array of objects:
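As a sketch of what such a reduce could look like, assuming the only array key to concatenate is messages (the third object, keyC and "three" are made up for illustration):

```shell
# Illustrative input: a single JSON array of objects (keyC and "three" are made up).
input='[{"messages":["one"],"keyA":"valueA"},{"messages":null,"keyB":"valueB"},{"messages":["three"],"keyC":"valueC"}]'

# Fold the objects left to right; . is the accumulated object, $o the next one.
result=$(printf '%s' "$input" | jq -c 'reduce .[] as $o ({}; . + $o + {messages: (.messages + $o.messages)})')
echo "$result"
```

Note that the input here is already an array, so -s is not needed. The null messages in the second object is absorbed, and the final messages array is ["one","three"].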