I am trying to parse the JSON from aws ecr describe-images and return a result with a specific image tag, matching a pattern, from the list. In particular, I get an array with many of these entries:
{
  "imageDetails": [
    {
      "registryId": "997652729005",
      "repositoryName": "events",
      "imageDigest": "sha256:5b649219a3abc5e903b27fd947f375df8634c883432a69e40d245ac2393d67b2",
      "imageTags": [
        "events-test-build-340"
      ],
      "imageSizeInBytes": 314454408,
      "imagePushedAt": "2021-01-12T10:42:51-05:00",
      "imageScanStatus": {
        "status": "COMPLETE",
        "description": "The scan was completed successfully."
      },
      "imageScanFindingsSummary": {
        "imageScanCompletedAt": "2021-01-12T10:43:00-05:00",
        "vulnerabilitySourceUpdatedAt": "2021-01-12T04:45:25-05:00",
        "findingSeverityCounts": {}
      },
      "imageManifestMediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "artifactMediaType": "application/vnd.docker.container.image.v1+json"
    },
    {
      "registryId": "997652729005",
      "repositoryName": "events",
      "imageDigest": "sha256:0fae259bcfe02c8cf0ec3746aae668b3166960e7119467496df9aedfbc2c8c5b",
      "imageTags": [
        "6debaabc26cc82a4011ea9c71854cebac7a57250-433",
        "6debaabc26cc82a4011ea9c71854cebac7a57250",
        "6debaabc26cc82a4011ea9c71854cebac7a57250-433-dev",
        "events-prod-build-433"
      ],
      "imageSizeInBytes": 316110570,
      "imagePushedAt": "2020-12-21T03:11:52-05:00",
      "imageScanStatus": {
        "status": "COMPLETE",
        "description": "The scan was completed successfully."
      },
      "imageScanFindingsSummary": {
        "imageScanCompletedAt": "2020-12-21T03:12:02-05:00",
        "vulnerabilitySourceUpdatedAt": "2020-11-03T20:21:09-05:00",
        "findingSeverityCounts": {}
      },
      "imageManifestMediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "artifactMediaType": "application/vnd.docker.container.image.v1+json"
    }
  ]
}
I would like the output to be something like this:
{
  "tag": [
    "6debaabc26cc82a4011ea9c71854cebac7a57250"
  ],
  "sha": "sha256:5b649219a3abc5e903b27fd947f375df8634c883432a69e40d245ac2393d67b2",
  "imagePushedAt": "2021-01-12T10:42:51-05:00"
}
The challenge is to pick the images that have a tag whose name includes *prod-build* (a deployed production build), but then return the tag that has no dashes in it, which is the tag we actually use. (Yes, this is entirely defective, I know.)
I have gotten pretty far:
cat ecr-describe-images-events.json | jq '.imageDetails[]
  | {tag: .imageTags, sha: .imageDigest, date_pushed: .imagePushedAt}
  | select( .tag | contains(["prod-build"]))
  .tag[] |= walk(
    if type=="string" then
      select(
        match("^[^-]+$")
      )
    else
      null
    end
  )'
So, from the imageDetails array, get and name the elements, then select the nodes whose array of tags has a tag containing the string prod-build. From these nodes, find the tags array element whose name does not include dashes, and return that.
The last part, which I have done with select, walk, and match, is behaving differently than I expect. I am getting:
{
  "tag": [
    "events-test-build-340"
  ],
  "sha": "sha256:5b649219a3abc5e903b27fd947f375df8634c883432a69e40d245ac2393d67b2",
  "date_pushed": "2021-01-12T10:42:51-05:00"
}
{
  "tag": [
    "6debaabc26cc82a4011ea9c71854cebac7a57250",
    "events-prod-build-433",
    null,
    null
  ],
  "sha": "sha256:8638389b7d83869b17b1c74ff30740d7cf8eff4574100c1270f20d4686252552",
  "date_pushed": "2021-02-17T13:11:42-05:00"
}
If I don’t include the last part, starting at walk(...), I get the correct nodes. But when I do use walk with match or test, I get back array elements that don’t match my regexp.
I am not fixed on my approach, or on the output format: I just need the three fields in some structure. What have I failed to understand?
3 Answers
It seems you want something like:
You will have to tweak this if you want to avoid selecting any objects for which .tag would otherwise be the empty array.
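A filter of the kind described above could be sketched as follows. This is only an illustration, not necessarily the exact filter this answer had in mind: the sample input is an abbreviated copy of the question's data, the shell variables input and result are placeholder names, and the output key date_pushed is taken from the question's own attempt. It assumes jq 1.5 or later, for any/2 and test.

```shell
# Abbreviated copy of the sample input: only the fields the filter reads.
input='{
  "imageDetails": [
    {
      "imageDigest": "sha256:5b649219a3abc5e903b27fd947f375df8634c883432a69e40d245ac2393d67b2",
      "imageTags": ["events-test-build-340"],
      "imagePushedAt": "2021-01-12T10:42:51-05:00"
    },
    {
      "imageDigest": "sha256:0fae259bcfe02c8cf0ec3746aae668b3166960e7119467496df9aedfbc2c8c5b",
      "imageTags": [
        "6debaabc26cc82a4011ea9c71854cebac7a57250-433",
        "6debaabc26cc82a4011ea9c71854cebac7a57250",
        "6debaabc26cc82a4011ea9c71854cebac7a57250-433-dev",
        "events-prod-build-433"
      ],
      "imagePushedAt": "2020-12-21T03:11:52-05:00"
    }
  ]
}'

result=$(printf '%s' "$input" | jq -c '
  .imageDetails[]
  # keep only images with at least one tag mentioning prod-build
  | select(any(.imageTags[]; test("prod-build")))
  # build the output object, keeping only the tags without a dash
  | {
      tag: (.imageTags | map(select(test("-") | not))),
      sha: .imageDigest,
      date_pushed: .imagePushedAt
    }
')
printf '%s\n' "$result"
```

With this sample, only the second image is selected, and its tag array is reduced to the single dash-free tag. To skip objects whose tag array would end up empty, one option is to append | select(.tag | length > 0) to the filter.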
Here’s a variant using contains for both checks, with the second one negated using not:
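One way such a contains-based variant might look is sketched below. It is an assumption about the shape of the filter rather than a reproduction of it; the abbreviated sample input and the shell variables input and result are placeholders. It relies on the documented behavior of contains: an array contains another array if each element of the second is contained in some element of the first, and a string contains its substrings.

```shell
# Abbreviated copy of the sample input: only the fields the filter reads.
input='{
  "imageDetails": [
    {
      "imageDigest": "sha256:5b649219a3abc5e903b27fd947f375df8634c883432a69e40d245ac2393d67b2",
      "imageTags": ["events-test-build-340"],
      "imagePushedAt": "2021-01-12T10:42:51-05:00"
    },
    {
      "imageDigest": "sha256:0fae259bcfe02c8cf0ec3746aae668b3166960e7119467496df9aedfbc2c8c5b",
      "imageTags": [
        "6debaabc26cc82a4011ea9c71854cebac7a57250-433",
        "6debaabc26cc82a4011ea9c71854cebac7a57250",
        "6debaabc26cc82a4011ea9c71854cebac7a57250-433-dev",
        "events-prod-build-433"
      ],
      "imagePushedAt": "2020-12-21T03:11:52-05:00"
    }
  ]
}'

result=$(printf '%s' "$input" | jq -c '
  .imageDetails[]
  # array containment: true if some tag has prod-build as a substring
  | select(.imageTags | contains(["prod-build"]))
  | {
      # string containment, negated: keep tags that have no dash
      tag: [.imageTags[] | select(contains("-") | not)],
      sha: .imageDigest,
      date_pushed: .imagePushedAt
    }
')
printf '%s\n' "$result"
```

Because contains does plain substring matching here, no regular expression is needed for either check.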
You can use del to delete all tags that match a certain criterion: first select those objects with a "prod-build" tag, and then delete all other tags from the list.
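A minimal sketch of the del approach follows, under the assumption that "all other tags" means every tag containing a dash. The abbreviated sample input, the shell variables input and result, and the final object construction (borrowed from the question's attempt) are illustrative choices, not part of this answer.

```shell
# Abbreviated copy of the sample input: only the fields the filter reads.
input='{
  "imageDetails": [
    {
      "imageDigest": "sha256:5b649219a3abc5e903b27fd947f375df8634c883432a69e40d245ac2393d67b2",
      "imageTags": ["events-test-build-340"],
      "imagePushedAt": "2021-01-12T10:42:51-05:00"
    },
    {
      "imageDigest": "sha256:0fae259bcfe02c8cf0ec3746aae668b3166960e7119467496df9aedfbc2c8c5b",
      "imageTags": [
        "6debaabc26cc82a4011ea9c71854cebac7a57250-433",
        "6debaabc26cc82a4011ea9c71854cebac7a57250",
        "6debaabc26cc82a4011ea9c71854cebac7a57250-433-dev",
        "events-prod-build-433"
      ],
      "imagePushedAt": "2020-12-21T03:11:52-05:00"
    }
  ]
}'

result=$(printf '%s' "$input" | jq -c '
  .imageDetails[]
  # keep only images with at least one tag mentioning prod-build
  | select(any(.imageTags[]; contains("prod-build")))
  # delete every tag that contains a dash, leaving the rest in place
  | del(.imageTags[] | select(contains("-")))
  | {tag: .imageTags, sha: .imageDigest, date_pushed: .imagePushedAt}
')
printf '%s\n' "$result"
```

Unlike updating .tag[] with a select inside |=, del collects all matching paths first and removes them together, so no null placeholders are left behind in the array.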
Output: