I have a CSV about 900,000 rows long sitting in an S3 bucket, and within that CSV I have two columns, phone and ttl.

I am able to successfully import this CSV into a new DynamoDB table, however I am NOT able to specify what type each column should be, so the ttl column is imported as a string rather than a number. In the CSV file itself the ttl values are not surrounded by quotation marks; they are purely numbers that are being misinterpreted.
2 Answers
When using the import from S3 feature, you can only specify the types of the partition key and sort key attributes. All other attributes default to string.
As a workaround, you could convert the CSV objects into DynamoDB JSON (DDB-JSON) or Amazon Ion. Both formats carry type information for every attribute and can be used as a source format for an import.
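For instance, here is a minimal sketch of that conversion (the file names and the hard-coded two-column mapping are assumptions based on the question, not part of any AWS tooling):

```python
# Convert each CSV row to a line of DynamoDB JSON, forcing the ttl
# attribute to the Number (N) type. File names are hypothetical.
import csv
import json

with open("phones.csv", newline="") as src, open("phones.ddb.json", "w") as dst:
    for row in csv.DictReader(src):
        item = {
            "phone": {"S": row["phone"]},
            # DynamoDB JSON always encodes numbers as strings under the
            # "N" key; the type tag, not the quoting, marks it as a number.
            "ttl": {"N": row["ttl"]},
        }
        # The S3 import's DynamoDB JSON format expects one {"Item": ...}
        # object per line.
        dst.write(json.dumps({"Item": item}) + "\n")
```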
Unfortunately, when using CSV files with DynamoDB's import from S3 feature, non-key attributes are imported as string type.
It's best to convert the data to DDB-JSON first so it keeps the correct types. Depending on how much data you have, you can either use a Lambda function or a Spark job to do the conversion.
When doing this, it's useful to leverage some of the SDKs' serialisation libraries, such as Python's:
https://boto3.amazonaws.com/v1/documentation/api/latest/_modules/boto3/dynamodb/types.html
The Python one is super useful as you can use it in both Lambda and Spark.
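For example, a short sketch of what boto3's TypeSerializer does (the sample row values are made up); note that it expects Decimal rather than float for numbers:

```python
from decimal import Decimal
from boto3.dynamodb.types import TypeSerializer

serializer = TypeSerializer()

# A hypothetical row as it might come out of the CSV, with ttl already
# parsed into a Decimal (boto3 rejects plain floats).
row = {"phone": "+15551234567", "ttl": Decimal("1735689600")}

# serialize() maps each Python value to its DynamoDB attribute type:
# str -> {"S": ...}, Decimal -> {"N": ...}, and so on.
ddb_item = {key: serializer.serialize(value) for key, value in row.items()}
print(ddb_item)
# {'phone': {'S': '+15551234567'}, 'ttl': {'N': '1735689600'}}
```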