I have a DynamoDB stream with a Lambda handler attached.
According to the metrics, the Lambda has a maximum duration of ~500 ms (probably cold starts), an average duration of ~50 ms, unlimited concurrency, a maximum of ~20 concurrent executions, and no errors or throttling.
But somehow I see spikes in iterator age of 30-60 seconds.
As I understand it, iterator age is the delay between record ingestion and Lambda processing.
This is supposed to be a near-real-time system, with delays of ~1 second, not 60 seconds.
In the Lambda I log records whose ApproximateCreationDateTime is more than 30 seconds old, and I do see some records logged, but that doesn't help much.
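A minimal sketch of this kind of age check, for anyone who wants to reproduce it. It assumes the standard Lambda event shape for DynamoDB streams, where `ApproximateCreationDateTime` arrives as epoch seconds. The function names and the logged fields are illustrative, not the actual handler from this system.

```python
import time

STALE_THRESHOLD_SECONDS = 30  # threshold mentioned in the question


def record_age_seconds(record, now):
    """Age of one stream record: `now` minus its ApproximateCreationDateTime (epoch seconds)."""
    return now - float(record["dynamodb"]["ApproximateCreationDateTime"])


def find_stale_records(event, now, threshold=STALE_THRESHOLD_SECONDS):
    """Return (eventID, age) pairs for records older than `threshold` seconds."""
    stale = []
    for record in event.get("Records", []):
        age = record_age_seconds(record, now)
        if age > threshold:
            stale.append((record["eventID"], age))
    return stale


def handler(event, context):
    now = time.time()
    batch_size = len(event.get("Records", []))
    for event_id, age in find_stale_records(event, now):
        # Logging the batch size alongside the age helps tell "one old record"
        # apart from "a whole batch that arrived late".
        print(f"stale record {event_id}: {age:.1f}s old (batch of {batch_size})")
    # ... normal processing of event["Records"] would go here ...
```

Note that the record's creation time is approximate, so small differences should not be read too precisely.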
These slowdowns don't happen very often: around 10 times per hour.
The screenshot shows the occurrences of records older than 30 seconds, found in the Lambda logs, over a period of 3 hours.
Could you please help me understand the root cause of these delays and what I can do to mitigate them?
2 Answers
A DynamoDB stream, just like a Kinesis stream, has partitions (shards). Your concurrency is limited not only by Lambda limits but also by the number of partitions. For DynamoDB streams, the number of stream partitions matches the number of partitions in your DynamoDB table. A few options to try:
My guess here is that this only happens infrequently, and the majority of the time you have a low iterator age.
That’s due to how DynamoDB Streams works under the hood: to ensure that no shard runs hot, the shards are periodically split, every 4 hours. This means that each shard on your stream will have a small spike every 4 hours (not all at once), because the split causes Lambda to do a shard discovery to obtain the new shard information. This shard discovery can take a couple of seconds, and thus you see a spike in your iterator age.
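If you want to check whether the spikes line up with shard changes, here is a rough sketch that lists the stream's shards and counts how many are still open. It assumes a boto3 `dynamodbstreams` client is passed in. The helper names are made up for illustration, and treating a shard with no `EndingSequenceNumber` as "open" is my reading of the `DescribeStream` response.

```python
def list_shards(streams_client, stream_arn):
    """Collect all shards of a stream, following DescribeStream pagination."""
    shards = []
    kwargs = {"StreamArn": stream_arn}
    while True:
        description = streams_client.describe_stream(**kwargs)["StreamDescription"]
        shards.extend(description.get("Shards", []))
        last_shard_id = description.get("LastEvaluatedShardId")
        if not last_shard_id:
            return shards
        # More pages remain: continue after the last shard already seen.
        kwargs["ExclusiveStartShardId"] = last_shard_id


def is_open(shard):
    """Assumption: a shard is still open while its range has no EndingSequenceNumber."""
    return "EndingSequenceNumber" not in shard.get("SequenceNumberRange", {})


def summarize_shards(shards):
    """Count open and closed shards in a list of shard descriptions."""
    open_count = sum(1 for shard in shards if is_open(shard))
    return {"total": len(shards), "open": open_count, "closed": len(shards) - open_count}
```

You would call it with something like `summarize_shards(list_shards(boto3.client("dynamodbstreams"), stream_arn))`, where `stream_arn` is your table's stream ARN. Running it on a schedule and comparing the set of shard IDs against the timestamps of the iterator age spikes should show whether the two are correlated.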