I’m not really sure how to phrase this question, so let me give an example. I have a large number of files containing JSON documents in one of two formats. Apart from one object, most of the content is irrelevant to me. I want to create a normalised version of each file. These are the two objects I care about (one in each format):
{
  "title": "Some data",
  "data": [
    {
      "id": "123",
      ...
    },
    {
      "id": "abc",
      ...
    }
  ]
}
and
{
  "title": "Some more data",
  "data": [
    {
      "ids": [
        {
          "id": "123",
          ...
        },
        {
          "id": "abc",
          ...
        }
      ],
      "names": [
        {
          "name": "A",
          ...
        },
        {
          "name": "B",
          ...
        }
      ]
    }
  ]
}
Each of those "object formats" is an object inside a JSON array in a file. I want to convert each of the files I have into a list of objects that captures the title, the list of id values and the list of name values in a single object:
{
  "title": "Some more data",
  "ids": [
    "123",
    "abc"
  ],
  "names": [
    "A",
    "B"
  ]
}
I use the following jq, but it doesn’t work (it creates multiple objects with the same title, one per name or id):
for f in $(find * -wholename "*.json" | sort); do
  cat $f | jq '..
    | if type == "object" then
        if has("data") then {
          "name": .title,
          "ids": (.data[] | [
            if has("id") then {
              "id": .id
            } else if has("ids") then {
              "ids": .ids[],
              "names": .names?
            } else null end
            end
          ])} else null end
      else null end
    | select(type != "null")' > "$f" ; done
2 Answers
You could iterate over the outer array using .[], then construct the objects using ? // to provide alternatives if one evaluates to null.
If you are okay with nulls in the complete absence of a key (as with .name in your first format), try this:
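A minimal sketch of that idea, not necessarily the filter from the original demo. The sample input is a hypothetical file content holding one object of each format with the elided fields left out, and the variable names sample and normalised are only illustrative.

```shell
# Hypothetical sample: a JSON array with one object per format.
sample='[
  {"title": "Some data", "data": [{"id": "123"}, {"id": "abc"}]},
  {"title": "Some more data", "data": [{
    "ids":   [{"id": "123"}, {"id": "abc"}],
    "names": [{"name": "A"}, {"name": "B"}]
  }]}
]'

# For each data item, descend into .ids[] / .names[] if they exist,
# otherwise fall back to the item itself, then read the key.
normalised=$(printf '%s\n' "$sample" | jq -c '.[] | {
  title,
  ids:   [.data[] | (.ids[]? // .) | .id],
  names: [.data[] | (.names[]? // .) | .name]
}')
printf '%s\n' "$normalised"
```

With this sample, the first format ends up with one null in names per data item, because those items have no name key.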
But you could also filter out nulls using values:
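As a sketch under the same assumptions (hypothetical sample input, illustrative variable names), appending values to each collected expression drops the nulls, so a missing key yields an empty array instead:

```shell
# Hypothetical sample: a JSON array with one object per format.
sample='[
  {"title": "Some data", "data": [{"id": "123"}, {"id": "abc"}]},
  {"title": "Some more data", "data": [{
    "ids":   [{"id": "123"}, {"id": "abc"}],
    "names": [{"name": "A"}, {"name": "B"}]
  }]}
]'

# values passes through everything except null.
normalised=$(printf '%s\n' "$sample" | jq -c '.[] | {
  title,
  ids:   [.data[] | (.ids[]? // .) | .id | values],
  names: [.data[] | (.names[]? // .) | .name | values]
}')
printf '%s\n' "$normalised"
```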
If you want to get rid of keys with empty arrays altogether, filter them out using map_values on a comparison using select:
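One way this could look, again as a sketch with a hypothetical sample and illustrative variable names. It assumes a reasonably recent jq (1.6 or later), where an update that produces no output inside map_values removes the key.

```shell
# Hypothetical sample: a JSON array with one object per format.
sample='[
  {"title": "Some data", "data": [{"id": "123"}, {"id": "abc"}]},
  {"title": "Some more data", "data": [{
    "ids":   [{"id": "123"}, {"id": "abc"}],
    "names": [{"name": "A"}, {"name": "B"}]
  }]}
]'

# Build the object as before, then drop every key whose value is [].
normalised=$(printf '%s\n' "$sample" | jq -c '.[] | {
  title,
  ids:   [.data[] | (.ids[]? // .) | .id | values],
  names: [.data[] | (.names[]? // .) | .name | values]
} | map_values(select(. != []))')
printf '%s\n' "$normalised"
```

Here the object built from the first format has no names key at all, while the second one keeps both arrays.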
Edit using the modified input files: As the deeper levels use the same (relative) path (here .specs[].spec), we need some other distinguishing criterion to rule out the level with "Some title you don’t care about". Checking for the presence of a .data key seems to fit with the new sample data.
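The modified input is not reproduced here, so the following is only a sketch against a made-up nested document that reuses the specs and spec keys. Instead of spelling out the .specs[].spec path, it swaps in recursive descent (..) and keeps only the objects that have a data key; the variable names are illustrative.

```shell
# Hypothetical nested sample; the real modified input may differ.
sample='{
  "specs": [
    {"spec": {
      "title": "Some title you do not care about",
      "specs": [
        {"spec": {"title": "Some data", "data": [{"id": "123"}, {"id": "abc"}]}},
        {"spec": {"title": "Some more data", "data": [{
          "ids":   [{"id": "123"}, {"id": "abc"}],
          "names": [{"name": "A"}, {"name": "B"}]
        }]}}
      ]
    }}
  ]
}'

# Visit every value, keep objects that carry a "data" key, normalise them.
normalised=$(printf '%s\n' "$sample" | jq -c '.. | objects | select(has("data")) | {
  title,
  ids:   [.data[] | (.ids[]? // .) | .id | values],
  names: [.data[] | (.names[]? // .) | .name | values]
}')
printf '%s\n' "$normalised"
```

The outer level is skipped because it has a title but no data key.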
If you are okay with having names: null or names: [] in the final document for your first example, the following looks like a simple solution:
or equivalent:
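A sketch of what such a simple filter could look like (an assumption, not the answerer's original code). It tries .id first and falls back to .ids[].id, and produces names: [] for the first format. The sample input is hypothetical and the variable names are illustrative.

```shell
# Hypothetical sample: a JSON array with one object per format.
sample='[
  {"title": "Some data", "data": [{"id": "123"}, {"id": "abc"}]},
  {"title": "Some more data", "data": [{
    "ids":   [{"id": "123"}, {"id": "abc"}],
    "names": [{"name": "A"}, {"name": "B"}]
  }]}
]'

# .id is null in the second format, so // falls back to .ids[].id;
# .names[]? yields nothing when there is no names array.
normalised=$(printf '%s\n' "$sample" | jq -c '.[] | {
  title,
  ids:   [.data[] | .id // .ids[].id],
  names: [.data[] | .names[]?.name]
}')
printf '%s\n' "$normalised"
```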
Output 1:
Output 2: