
This is a variation on a question that’s been asked before.

I’m using an external data source in Terraform to fetch a list of volume snapshots in AWS Dublin (eu-west-1), and jq in a templatefile to extract the snapshot IDs.

data "external" "volsnapshot_ids" {
  program = [
    "bash",
    "-c",
    templatefile("cli.tftpl", {input_string = "aws ec2 describe-snapshots --region=eu-west-1", top = "Snapshots", next = "| .SnapshotId"})
  ]
}

And it uses this templatefile:

#!/bin/bash
set -e

OUTPUT=$(${input_string} | jq  -r -c '.${top}[] ${next}' | jq -R -s -c 'split("\n")' | jq '.[:-1]')
jq -n -c --arg output "$OUTPUT" '{"output":$output}'

The basic CLI command with JQ works and looks like this:

aws ec2 describe-snapshots --region=eu-west-1 | jq  -r -c '.Snapshots[] | .SnapshotId' | jq -R -s -c 'split("\n")' | jq '.[:-1]' | wc -l

It returns a lot of snapshot ids.

When I run it through Terraform though, it errors:

Error: External Program Execution Failed
│ 
│   with data.external.volsnapshot_ids,
│   on data.tf line 304, in data "external" "volsnapshot_ids":
│  304:   program    = [
│  305:     "bash",
│  306:     "-c", 
│  307:     templatefile("cli.tftpl", {input_string = "aws ec2 describe-snapshots --region=eu-west-1", top = "Snapshots", next = "| .SnapshotId"})]
│ 
│ The data source received an unexpected error while attempting to execute
│ the program.
│ 
│ Program: /bin/bash
│ Error Message: bash: line 6: /usr/local/bin/jq: Argument list too long
│ 
│ State: exit status 1

I think the cause is the size of the dataset being returned, because the same code works in regions with fewer snapshot IDs; London works.

Sizewise, here’s London:

aws ec2 describe-snapshots --region=eu-west-2 | jq  -r -c '.Snapshots[] | .SnapshotId' | jq -R -s -c 'split("\n")' | jq '.[:-1]' | wc -l
20000

And here’s Dublin:

aws ec2 describe-snapshots --region=eu-west-1 | jq  -r -c '.Snapshots[] | .SnapshotId' | jq -R -s -c 'split("\n")' | jq '.[:-1]' | wc -l
42500

Is there a way to fix up the JQ in my templatefile so it can handle big JSON files?

2 Answers


  1. I wouldn’t recommend running a command inside a Terraform data source; it can be hard to debug. The AWS provider has data sources for EBS snapshots, such as aws_ebs_snapshot_ids.

    As for the command inside your template: to debug it, you need to reproduce the environment Terraform runs it in. For example, instead of running the pipeline directly in your shell, run the rendered template through bash -c, as the data source does. You can also add an output that shows the rendered template, to check it for problems.
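
    To make the "same environment" advice concrete, here is a minimal sketch of running a script the way the data source does and then checking what it printed. The script string below is a hypothetical stand-in for the rendered template, and jq is assumed to be installed. The check reflects the external data source's requirement that the program print a JSON object whose values are all strings.

    ```shell
    #!/bin/bash
    # Hypothetical stand-in for the rendered cli.tftpl; swap in the real rendered script.
    script='printf "%s\n" "{\"output\":\"[]\"}"'

    # Run it the way the data source does: as one string handed to bash -c.
    result=$(bash -c "$script")
    echo "$result"

    # The program must print a JSON object whose values are all strings.
    if printf '%s' "$result" | jq -e 'type == "object" and all(.[]; type == "string")' >/dev/null; then
      valid=yes
    else
      valid=no
    fi
    echo "valid external result: $valid"
    ```

    If the check reports "no", the problem is in the script's output rather than in Terraform itself.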

  2. Scroll to bottom of answer.


    Don’t pass the value as a command-line argument; pass it directly via standard input:

    aws ... \
      | jq -rc '.${top}[] ${next}' \
      | jq -Rsc './"\n"' \
      | jq -c '.[:-1]' \
      | jq -Rc '{output:.}'
    

    Note that you can probably combine most of the separate jq invocations into a single jq program.
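
    For context, the "Argument list too long" error comes from the operating system's limit on the size of arguments handed to a new process, which is what `--arg output "$OUTPUT"` runs into. The following is a rough sketch of that limit, assuming a typical Linux or macOS system; the 4 MiB payload size is an arbitrary choice meant to exceed the limit, not a value from the question.

    ```shell
    #!/bin/bash
    # 4 MiB of "x": an arbitrary size assumed to exceed common argument-size limits.
    size=$((4 * 1024 * 1024))
    payload=$(head -c "$size" /dev/zero | tr '\0' 'x')

    # Passing the payload as an argument to an external program needs exec(),
    # which is expected to fail here with "Argument list too long".
    if /bin/echo "$payload" >/dev/null 2>&1; then
      arg_status=ok
    else
      arg_status=failed
    fi

    # printf is a bash builtin, so no exec() is involved; the data reaches wc through the pipe.
    stdin_bytes=$(printf '%s' "$payload" | wc -c | tr -d ' ')

    echo "as argument: $arg_status, via stdin: $stdin_bytes bytes"
    ```

    The same data that cannot be passed as an argument goes through a pipe without trouble, which is why feeding jq on standard input avoids the error.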


    This pipeline of jq invocations is a massively, massively overcomplicated non-solution. Why convert back and forth between strings and JSON objects, parsing those strings again, when jq can already process the data directly?

    aws ... | jq -c '{ output: .Snapshots | map(.SnapshotId) | tostring }'
    

    Example output:

    {"output":"[\"snap-cafebabe\",\"snap-deadbeef\",\"snap-0123abcd\"]"}
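
    To try this without calling AWS, here is a small sketch that feeds a made-up sample through the same jq program. The sample JSON is an assumption that only mimics the fields used here, and jq is assumed to be installed.

    ```shell
    #!/bin/bash
    # Made-up sample shaped like describe-snapshots output (only the fields used here).
    sample='{"Snapshots":[{"SnapshotId":"snap-cafebabe"},{"SnapshotId":"snap-deadbeef"}]}'

    # The list travels over stdin, so its size never touches the command line.
    result=$(printf '%s' "$sample" | jq -c '{ output: .Snapshots | map(.SnapshotId) | tostring }')
    echo "$result"

    # Decoding the string again recovers the array; count its elements as a sanity check.
    count=$(printf '%s' "$result" | jq -r '.output | fromjson | length')
    echo "snapshot ids: $count"
    ```

    On the Terraform side, something like jsondecode(data.external.volsnapshot_ids.result.output) should turn the string back into a list, since the external data source only returns string values.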
    

    If you have to use variables:

    top=Snapshots
    next=SnapshotId
    aws ... | jq --arg top "$top" --arg next "$next" -c '{ output: .[$top] | map(.[$next]) | tostring }'
    

    or .[$top] | map(.[$next]) | tostring | { output: . } or .[$top] | map(.[$next]) | { output: tostring }.

    Even if you want or need to string together multiple jq invocations, there’s little sense in consuming raw input (-R) and trying to parse it when you already have perfectly structured JSON items in stream form.

    Here is what it would look like if you wanted to do it with multiple steps, but always stay in JSON land (and not play ping pong between structured JSON and unstructured text):

    top=Snapshots
    next=SnapshotId
    aws ... \
      | jq --arg top "$top" --arg next "$next" '.[$top][][$next]' \
      | jq -sc '{ output: tostring }'
    

    or the equivalent:

    top=Snapshots
    next=SnapshotId
    aws ... \
      | jq --arg top "$top" --arg next "$next" '.[$top] | map(.[$next])' \
      | jq -c '{ output: tostring }'
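
    As a quick check that the variable-driven variants agree, the sketch below runs the single-program form and both two-step forms on a made-up sample in place of the aws call. The sample JSON and the variable names one, two and three are assumptions for illustration; jq is assumed to be installed.

    ```shell
    #!/bin/bash
    top=Snapshots
    next=SnapshotId
    # Made-up sample standing in for the aws output.
    sample='{"Snapshots":[{"SnapshotId":"snap-cafebabe"},{"SnapshotId":"snap-deadbeef"}]}'

    # Single jq program.
    one=$(printf '%s' "$sample" \
      | jq --arg top "$top" --arg next "$next" -c '{ output: .[$top] | map(.[$next]) | tostring }')

    # Two steps: stream the ids, then slurp them back into one array.
    two=$(printf '%s' "$sample" \
      | jq --arg top "$top" --arg next "$next" '.[$top][][$next]' \
      | jq -sc '{ output: tostring }')

    # Two steps: build the array first, then wrap it.
    three=$(printf '%s' "$sample" \
      | jq --arg top "$top" --arg next "$next" '.[$top] | map(.[$next])' \
      | jq -c '{ output: tostring }')

    echo "$one"
    if [ "$one" = "$two" ] && [ "$two" = "$three" ]; then same=yes; else same=no; fi
    echo "all variants agree: $same"
    ```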
    