
I have created a Lambda function that extracts the audio stream from a video file using ffmpeg. I have also configured API Gateway as a trigger, and I am passing the file to the Lambda function in the request body.

The Lambda function works perfectly well with small files, but bigger files need more time and I run into the API Gateway timeout, which to my understanding is capped at 29 seconds.

So when I trigger audio extraction from a bigger file, I hit this timeout and my API request returns no result, even though the transcoding keeps running in the background and the audio is extracted. What is the best approach to handle cases where the execution of the Lambda function takes longer?

My idea was to start the transcoding in the background and simply return a JSON message saying that the transcoding might take a couple of minutes, depending on the input file duration. But if I try to push ffmpeg to the background, I get an error saying that the destination file doesn't exist.

os.system(f"{ffmpeg} -loglevel panic -nostdin -i {in_video} -vn -c:a aac -ar 48000 -b:a 192K {out_audio} 2> /dev/null &")

This is the ffmpeg command extracting the audio and transcoding it to AAC.
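For reference, the same invocation written with subprocess in the foreground (without the trailing &) looks roughly like this; ffmpeg, in_video and out_audio are the same variables used above:

import subprocess

# Same arguments as the os.system call above; running in the foreground
# and checking the exit code instead of silently discarding stderr.
subprocess.run(
    [ffmpeg, "-loglevel", "panic", "-nostdin", "-i", in_video,
     "-vn", "-c:a", "aac", "-ar", "48000", "-b:a", "192K", out_audio],
    stderr=subprocess.DEVNULL,
    check=True,  # raise immediately if ffmpeg exits with an error
)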

If I remove the 2> /dev/null & part of the command, it runs just fine, but if I keep it, I get an error:

"errorMessage": "[Errno 2] No such file or directory: ‘output_audio.aac’"

"errorType": "FileNotFoundError"

So I was wondering what the preferred way is to run such a process in the background.

2 Answers


  1. There are many options that can be considered.
    But first, since you already have the whole flow working with Lambda behind API Gateway, you can use a Lambda function URL.
    Function URLs are a good way to trigger a Lambda function via HTTPS, and they support multiple authorization mechanisms, such as IAM.

    The interesting point is the timeout. A request made through a function URL can run for as long as the function's own timeout allows, which can be set to up to 15 minutes, definitely better than the 29 seconds you get with API Gateway. A minimal sketch of enabling a function URL is shown at the end of this answer.

    Function URLs are free of charge and can be enabled on an existing Lambda function.

    Increasing the timeout might just push the problem back until you get a very big file to convert. In the long run it may be worth exploring other solutions, such as uploading the file to S3 and processing it with AWS Batch, or spinning up an EC2 instance. That would require more architecture design and implementation, though.

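    A rough sketch of enabling a function URL on an existing function with boto3 (the function name here is a placeholder):

        import boto3

        lam = boto3.client("lambda")

        # Attach a function URL to the existing function; the name is a placeholder.
        resp = lam.create_function_url_config(
            FunctionName="extract-audio",
            AuthType="AWS_IAM",  # IAM-signed requests; "NONE" would make it public
        )
        print(resp["FunctionUrl"])  # e.g. https://<url-id>.lambda-url.<region>.on.aws/

    With AWS_IAM as the auth type, callers still need permission to invoke the URL (lambda:InvokeFunctionUrl) on the function.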
  2. For longer processing, it is recommended to use asynchronous invocations, where the Lambda function is triggered, runs until completion, and does not block the caller. One option would be to upload the file to S3, configure the Lambda function to react to the S3 event, download the file from S3 inside the function, process it, and upload the result to another S3 bucket after processing completes. A minimal handler sketch along those lines is shown below.

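    A sketch of such an event-driven handler, assuming an ffmpeg binary available on PATH (for example via a Lambda layer) and a placeholder output bucket name:

        import os
        import subprocess
        from urllib.parse import unquote_plus

        import boto3

        s3 = boto3.client("s3")
        OUTPUT_BUCKET = "my-audio-output"  # placeholder bucket for the extracted audio

        def handler(event, context):
            # Triggered by an S3 ObjectCreated event on the upload bucket.
            record = event["Records"][0]
            bucket = record["s3"]["bucket"]["name"]
            key = unquote_plus(record["s3"]["object"]["key"])

            in_video = f"/tmp/{os.path.basename(key)}"
            out_audio = "/tmp/output_audio.aac"
            s3.download_file(bucket, key, in_video)

            # No caller is waiting on an HTTP response, so ffmpeg can run in the
            # foreground; "ffmpeg" is assumed to be provided by a layer.
            subprocess.run(
                ["ffmpeg", "-loglevel", "panic", "-nostdin", "-i", in_video,
                 "-vn", "-c:a", "aac", "-ar", "48000", "-b:a", "192K", out_audio],
                check=True,
            )

            s3.upload_file(out_audio, OUTPUT_BUCKET, f"{os.path.splitext(key)[0]}.aac")

    Since nothing waits on an HTTP response in this flow, the 29-second API Gateway limit no longer applies; the audio simply appears in the output bucket once ffmpeg finishes.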