
Problem description

How can I create an item from a JSON file and send it to DynamoDB with Python, so that the attribute type in DynamoDB ends up as a StringSet, as defined in the AWS documentation?

I am using the Boto3 put_item API.

Approaches attempted

I’ve found that if I define the values as a JSON list, they get converted to a List of Objects, which is not what I am going for here; I know all of the values are the same type, and I want them treated as a Set of one type of Object. I don’t think JSON has an explicit way to define a Set, but please let me know if there is one.

If I define the item within my Python script itself and explicitly make the object a Python Set, the data will not load through the put_item call.

I did find some references to using the JavaScript Document Client to create a StringSet, but I haven’t seen anything similar for Python.

I’ve seen some references to putting literal type descriptors like "SS" or "M" as the keys, with the object as the value for each key, but I’ve tried that in a variety of ways and haven’t gotten it through the put_item call at all. If that’s a way to do this, I’m fine with that as well.
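For illustration, this is the kind of structure I mean (a sketch with made-up table and key names; as far as I can tell, this type-descriptor format is what the low-level boto3 client expects):

import boto3

client = boto3.client("dynamodb")  # low-level client, not the resource layer

client.put_item(
    TableName="my-table",  # made-up table name
    Item={
        "pk": {"S": "1234"},  # made-up partition key
        "mykey": {
            "M": {
                "secondkey": {"SS": ["val1", "val2", "val3"]}
            }
        },
    },
)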

Example JSON

{
  "mykey": 
  {
    "secondkey": ["val1", "val2", "val3"]
  }
}

Motivation

Having the data stored as a StringSet makes it easier to read the objects out in a Step Function further down the line. Instead of needing a Map state to iterate over the items in the list, I can hand them directly to something else that needs them. Alternatives to this approach are also welcome, especially if this doesn’t turn out to be possible.

2 Answers


  1. Chosen as BEST ANSWER

Lee’s answer below fulfilled the requirement of sending a Set to DynamoDB using Python.

    In order to read from a JSON file and still create a Set I needed to add metadata to the JSON file for which fields should be converted to Sets, as outlined in this previous answer on StackExchange.

    For my particular use case I had a Lambda reading the JSON file and sending it to DynamoDB, so I added my extra processing within the Lambda.

    Given my starting JSON:

    {
      "mykey": 
      {
        "secondkey": ["val1", "val2", "val3"]
      }
    }
    

    I added extra metadata:

    {
      "mykey": 
      {
        "set_fields": ["secondkey"],
        "values": {
            "secondkey": ["val1", "val2", "val3"]
        }
      }
    }
    

Now my Lambda reads the data at mykey, reads out the set_fields list, reads out the values dict, and applies Python’s set() to whichever fields are named in set_fields.

    def add_sets_to_data(data: dict) -> dict:
        """Convert the fields named in data["set_fields"] into Python sets."""
        output_data = data["values"]
        set_fields = data["set_fields"]

        # Replace each annotated field's list with a set so boto3
        # serializes it as a DynamoDB Set instead of a List
        for set_field in set_fields:
            output_data[set_field] = set(output_data[set_field])

        return output_data

    

    When the resulting dictionary is loaded to DynamoDB it correctly creates the set_fields fields as StringSets instead of Lists.
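
    For context, a rough sketch of how this fits together in the Lambda (the file path, table name, and partition key here are placeholders for my real values):

    import json
    import boto3

    table = boto3.resource("dynamodb").Table("my-table")  # placeholder table name

    with open("/tmp/input.json") as f:  # placeholder path to the JSON file
        raw = json.load(f)

    # Convert the annotated fields to Python sets, then write the item
    item = add_sets_to_data(raw["mykey"])
    item["pk"] = "1234"  # placeholder partition key
    table.put_item(Item=item)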


  2. Assuming you use the resource-level Table client:

    import boto3

    # region is a placeholder here; substitute your own
    res = boto3.resource("dynamodb", region_name="us-east-1")
    table = res.Table("my-table")

    item = {
        "pk": "1234",
        "mykey": {
            # a Python set serializes as a DynamoDB StringSet (SS)
            "secondkey": {"val1", "val2", "val3"}
        },
    }

    table.put_item(Item=item)
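
    The resource layer serializes native Python types for you, and a Python set of strings maps to the DynamoDB SS type. You can see that with the serializer boto3 uses under the hood (a quick check, not required for the put_item call):

    from boto3.dynamodb.types import TypeSerializer

    serializer = TypeSerializer()
    print(serializer.serialize({"val1", "val2", "val3"}))
    # -> {'SS': ['val1', 'val2', 'val3']} (element order may vary)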
    