I am currently working on a stream of data that comes from an IoT rule and inputs the readings to a dynamoDB table.
From that I need to rearrange the data to make predictions. For every new data entry, every new row I trigger a lambda function, but I need to get the last 96 table rows to manipulate.
My problem is
1- how can I query the table
2- and transform to something familiar like a pandas dataframe?
In the table I have a timestamp column (the format is still open to be decided)
2
Answers
Ultimately it depends on how your data is structured and how you want to access it. But to answer your question, when DynamoDB Streams invokes your Lambda you can use a
Query
orScan
to retrieve the data from DynamoDB.My assumption here is that you need to retrieve the last 96 updates based on timestamp,
Scan
will not allow you to do that efficiently. You would need to useQuery
however, that will dictate your data model.Depending on your specific use-case needs, I would create a Global Secondary Index with a static partition key such as
GSI_PK=1
and your timestamp as the sort keyNow you can
Query
your GSI and be sure you are being returned the last 96 items. GSI’s are eventually consistent so be aware of that.