I’m trying to export data from a DynamoDB transaction table using Python. Until now I was able to get all the data from the table but I would like to add a filter that allows me to only get the data from a certain date until today.
There is a field called CreatedAt that indicates the time when the transaction was made, and I was thinking of using this field to filter the new data.
This is the code I’ve been using to scan the table. It would be really helpful if anyone could tell me how to apply this filter in this script.
import boto3
import pandas as pd
from boto3.dynamodb.conditions import Attr
aws_access_key_id = '*****'
aws_secret_access_key = '*****'
region='****'
dynamodb = boto3.resource(
    'dynamodb',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region
)
transactions_table = dynamodb.Table('transactions_table')
result = transactions_table.scan()
items = result['Items']
df_transactions_table = pd.json_normalize(items)
print(df_transactions_table)
Thanks!
2 Answers
Boto3 supports a FilterExpression parameter on DynamoDB queries and scans, which can filter on that field.
Note that using a FilterExpression still consumes the same amount of read capacity units, because the filter is applied after the items have been read.
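Because the filter runs after each page is read, a single scan call can return few or no items while more data remains, so the pages need to be followed to the end. Here is a minimal sketch of that loop; `scan_all_pages` is a hypothetical helper name, not part of Boto3, and it assumes `table` is a Boto3 Table resource such as `transactions_table` from the question.

```python
def scan_all_pages(table, **scan_kwargs):
    """Collect items from every page of a scan by following LastEvaluatedKey."""
    items = []
    while True:
        page = table.scan(**scan_kwargs)
        # A filtered page may be empty and still not be the last one.
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # Resume the scan where the previous page stopped.
        scan_kwargs["ExclusiveStartKey"] = last_key

# Hypothetical usage with the table from the question:
# items = scan_all_pages(transactions_table)
```

Any keyword arguments, such as a FilterExpression, are passed through unchanged to every page request.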
You need to pass a FilterExpression to the scan, with a condition on CreatedAt. You can learn more from the docs on Boto3 Scan and FilterExpression.
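As a rough sketch of such a filter, the snippet below builds the scan arguments in the string-expression form. It assumes CreatedAt is stored as an ISO 8601 string (so that string comparison matches chronological order); if it is stored as a number such as an epoch timestamp, a number would be passed instead. `created_since_scan_kwargs` is a hypothetical helper name. The condition-builder form, `Attr('CreatedAt').gte(start)` from `boto3.dynamodb.conditions`, expresses the same condition.

```python
def created_since_scan_kwargs(start, attribute="CreatedAt"):
    """Build scan() keyword arguments that keep items with attribute >= start."""
    return {
        # "#created" is a placeholder for the attribute name, ":start" for the value.
        "FilterExpression": "#created >= :start",
        "ExpressionAttributeNames": {"#created": attribute},
        "ExpressionAttributeValues": {":start": start},
    }

# Assumed start date; it must use the same format as the stored CreatedAt values.
scan_kwargs = created_since_scan_kwargs("2023-01-01")

# Hypothetical usage with the table from the question:
# result = transactions_table.scan(**scan_kwargs)
# items = result['Items']
```

No upper bound is needed for "until today" as long as no item carries a future CreatedAt value.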
Some advice: please do not hard-code your keys the way you have done in this code; use an IAM role instead. If you are testing locally, configure the AWS CLI, which will provide credentials that you can use when testing. That way you won’t make a mistake and share keys on GitHub, etc.
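For illustration, here is a minimal sketch of creating the table resource without keys in the script. `open_table` and its parameters are hypothetical names; the sketch assumes credentials and a region are already available through the AWS CLI configuration or an attached IAM role.

```python
def open_table(table_name, profile_name=None, region_name=None):
    """Return a DynamoDB Table resource without embedding keys in the script."""
    # Imported inside the function so this sketch can be loaded without boto3.
    import boto3

    # With no explicit keys, boto3 falls back to its default credential chain
    # (environment variables, the shared AWS CLI config, or an attached IAM role).
    session = boto3.Session(profile_name=profile_name, region_name=region_name)
    return session.resource("dynamodb").Table(table_name)

# Hypothetical usage with the table name from the question:
# transactions_table = open_table("transactions_table")
```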