
I’m getting a little desperate trying to automate Avro schema registration for the AWS Glue Schema Registry.

According to the official documentation (https://docs.aws.amazon.com/glue/latest/dg/schema-registry-gs.html#schema-registry-gs4), the following command must be executed to register a new Avro schema:

aws glue create-schema --registry-id RegistryName="my-registry-name" --schema-name testschema --compatibility BACKWARD --data-format AVRO --schema-definition "{\"type\":\"record\",\"name\":\"r1\",\"fields\":[{\"name\":\"f1\",\"type\":\"int\"},{\"name\":\"f2\",\"type\":\"string\"}]}"

This example works fine. But now I want to automate the process for other schemas, so I’m storing the schema definition in a shell variable like this:

current_schema=$(cat testschema.avro | jq -c . | jq -R .)

So here I have exactly the same Avro schema, but this time it comes from a valid *.avro file containing valid JSON. I convert it into a one-liner, escape the JSON, and store the output in the $current_schema variable.
When I echo this variable, I see exactly the same escaped JSON as in the official documentation:

echo $current_schema
"{\"type\":\"record\",\"name\":\"r1\",\"fields\":[{\"name\":\"f1\",\"type\":\"int\"},{\"name\":\"f2\",\"type\":\"string\"}]}"

But the trouble starts when I use $current_schema as the last parameter of the aws command:

aws glue create-schema --registry-id RegistryName="my-registry-name" --schema-name testschema --compatibility BACKWARD --data-format AVRO --schema-definition $current_schema

I see the following error output:

An error occurred (InvalidInputException) when calling the CreateSchema operation: Schema definition of AVRO data format is invalid: Illegal initial character: {"type":"record","name":"r1","fields":[{"name":"f1","type":"int"},{"name":"f2","type":"string"}]}

Does anyone have any idea what’s going on and how I can solve this? It must be possible to read an arbitrary *.avro schema file, escape the JSON, and pass that escaped JSON as the --schema-definition parameter value. Unfortunately, it only works when I paste the escaped JSON of the schema "as is".

Thank you!

2 Answers


  1. Chosen as BEST ANSWER

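    OK... The problem is solved. It turns out that the JSON representation of the schema must not be escaped at all. The extra jq -R step wraps the schema in a quoted JSON string, so Glue receives a string literal beginning with a quote character instead of a record, hence the "Illegal initial character" error. Yet another example of an undocumented AWS feature.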

    So this works perfectly:

    current_schema=$(cat SchemaFile.avro)
    aws glue create-schema --registry-id RegistryName="my-registry-name" --schema-name testschema --compatibility BACKWARD --data-format AVRO --schema-definition "$current_schema"
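
    As a rough illustration of why no escaping is needed: when a JSON literal is typed inline inside double quotes, the inner quotes have to be backslash-escaped for the shell, and the shell strips those backslashes before the command ever sees the argument. The sketch below uses a shortened schema and the made-up variable names inline and raw; it only compares strings and does not call aws.

    ```shell
    # Typed inline: inner quotes are escaped for the shell, which removes the backslashes.
    inline="{\"type\":\"record\",\"name\":\"r1\"}"

    # Held as data: single quotes stand in for the raw contents of an .avro file.
    raw='{"type":"record","name":"r1"}'

    # Both variables now hold plain, unescaped JSON.
    printf '%s\n' "$inline"
    [ "$inline" = "$raw" ] && echo "same argument either way"
    ```

    So a schema read from a file already has the form the command expects, as long as the variable is double-quoted when it is passed on.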
    

  2. If you are using a (non-Windows) shell, this should work:

    current_schema=$(cat testschema.avro | jq -c)
    aws glue create-schema --registry-id RegistryName="my-registry-name" --schema-name testschema --compatibility BACKWARD --data-format AVRO --schema-definition "$current_schema"
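
    To extend this to many schemas, a minimal sketch might loop over a directory of *.avro files. Everything beyond the create-schema flags shown above is an assumption made for illustration: the register_schemas helper, the AWS_CMD override (set it to echo for a dry run), and the idea of naming each schema after its file.

    ```shell
    # Sketch only: register every *.avro file in a directory.
    # Assumptions (not from the thread): the helper name, the AWS_CMD dry-run
    # hook, and deriving the schema name from the file name.
    AWS_CMD="${AWS_CMD:-aws}"

    register_schemas() {
      local schema_dir="$1" registry="$2" file name definition
      for file in "$schema_dir"/*.avro; do
        [ -e "$file" ] || continue        # directory contained no *.avro files
        name=$(basename "$file" .avro)    # testschema.avro -> testschema
        definition=$(cat "$file")         # raw JSON, no extra escaping
        # Double quotes keep the whole JSON document as a single argument.
        "$AWS_CMD" glue create-schema \
          --registry-id RegistryName="$registry" \
          --schema-name "$name" \
          --compatibility BACKWARD \
          --data-format AVRO \
          --schema-definition "$definition"
      done
    }
    ```

    It could then be called as register_schemas ./schemas my-registry-name. The AWS CLI can also load a parameter value straight from a file with the file:// prefix (for example --schema-definition file://testschema.avro), which avoids the shell variable altogether.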
    