I’m creating a serverless application in AWS that needs to store data gathered through an external API in a DynamoDB table.

I’m trying to do this with Lambda functions, but the data is quite heavy, so every time I try to perform some data manipulation on it I get a "timeout error", even with the timeout set to several minutes (the same script runs in less than 10 s on my computer).
I also tried only fetching the data and parsing it with json.loads(), but I still get the timeout error.

Looking around the internet, I saw that there are quite a few ways to pull data from an external API endpoint in AWS, such as Glue or AppFlow.

My questions are:

  1. Is Lambda a good choice for this type of task?
  2. What could be causing my "timeout" problems?
  3. Would you suggest better alternatives to accomplish this task?

Thank you in advance

2 Answers


  1. My questions are:

    Is it a good choice using Lambdas for this type of task?

    Lambda has a maximum timeout of 15 minutes. If your work could take more than 15 mins then you can try a couple of things:

    1. Increase your Lambda memory
    2. Split the work across multiple Lambdas

    What could cause my "timeout" problems?

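    Make sure the task isn’t stuck in a loop, and add enough logging to see which step is slow. Also increase the Lambda memory: Lambda allocates CPU in proportion to the configured memory, so with enough memory the function should run about as fast as it does on your local machine.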

    Do you suggest better alternatives to accomplish this task?

    It depends. If it’s a large amount of data (GB+), I would suggest using AWS Glue.
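
To make the logging advice concrete, here is a minimal sketch of a handler that times the fetch and parse stages separately and logs how much of the Lambda timeout is left. It is only an illustration under assumptions: `API_URL`, the `fetch` helper, and the `fetch_fn` parameter are hypothetical names and not from the question. `context.get_remaining_time_in_millis()` is the method the Lambda runtime provides on the context object.

```python
import json
import logging
import time
import urllib.request

logger = logging.getLogger()
logger.setLevel(logging.INFO)

API_URL = "https://example.com/data"  # hypothetical endpoint, replace with yours


def fetch(url, timeout_s=10):
    # An explicit timeout makes a hung connection fail fast instead of
    # silently using up the whole Lambda timeout.
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.read()


def handler(event, context, fetch_fn=fetch):
    # fetch_fn is an assumption added so the handler can be tried locally
    # with a stub; Lambda itself only passes event and context.
    stages = {}

    start = time.perf_counter()
    raw = fetch_fn(event.get("url", API_URL))
    stages["fetch_s"] = time.perf_counter() - start

    start = time.perf_counter()
    records = json.loads(raw)
    stages["parse_s"] = time.perf_counter() - start

    # Remaining time before Lambda kills the invocation.
    remaining_ms = context.get_remaining_time_in_millis()
    logger.info("stages=%s remaining_ms=%d", stages, remaining_ms)
    return {"count": len(records), "remaining_ms": remaining_ms, **stages}
```

If the log shows most of the time in the fetch stage, or the log line never appears, the bottleneck is probably the network call rather than the data manipulation, and more memory alone is unlikely to help.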

  2. In addition to the points mentioned by Leeroy Hannigan, with respect to question 3:

    See whether you can split the workload and use AWS Step Functions to orchestrate it.

    For example, you can use separate Lambda functions to split, iterate over, transform, and then load the data.
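
As a rough sketch of that split / transform / load idea, the three handlers below could each be deployed as its own Lambda function and chained by a state machine (for instance with a Map state iterating over the chunks). Everything here is an assumption for illustration: the event shapes, `TABLE_NAME`, the `id` field, and the optional `table` parameter are hypothetical. `batch_writer()` and `put_item(Item=...)` are the boto3 DynamoDB table resource calls.

```python
import json
from decimal import Decimal

TABLE_NAME = "my-table"  # hypothetical table name


def split_handler(event, context):
    # Cut the record list into fixed-size chunks for parallel processing.
    records = event["records"]
    size = event.get("chunk_size", 100)
    return {"chunks": [records[i:i + size] for i in range(0, len(records), size)]}


def transform_handler(event, context):
    # Reshape one chunk into DynamoDB-style items (assumes each record has "id").
    items = [{"pk": str(r["id"]), "payload": r} for r in event["chunk"]]
    return {"items": items}


def load_handler(event, context, table=None):
    # table can be injected for local testing; by default use the real table.
    if table is None:
        import boto3  # included in the Lambda Python runtime
        table = boto3.resource("dynamodb").Table(TABLE_NAME)
    with table.batch_writer() as batch:
        for item in event["items"]:
            # The boto3 resource API rejects Python floats, so convert
            # them to Decimal just before writing.
            batch.put_item(Item=json.loads(json.dumps(item), parse_float=Decimal))
    return {"written": len(event["items"])}
```

Note that Step Functions limits the size of the payload passed between states, so for heavy data the handlers would more likely pass S3 object keys than the records themselves.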
