
I am using Python 3.9 and boto3 to run the code below from a Lambda function.
The function works about 90% of the time, but every now and then it stops working for a few hours.
When that happens, the Lambda function times out.
See the CloudWatch log below.
The Lambda timeout is set to 20 seconds. When it works, the S3 file is read in about three hundredths of a second.
The get_object call doesn't look like it times out; it just hangs until Lambda kills the task.
All the except clauses in the try block are just my attempts at troubleshooting. The function never prints any of the except messages to the log.

Any ideas why S3 drops out for hours at a time, and why the boto3 get_object call never times out?

Thanks for any help!


import boto3
import botocore.exceptions

# 5-second connect and read timeouts, with retries disabled
config = boto3.session.Config(connect_timeout=5, read_timeout=5, retries={'max_attempts': 0})
s3 = boto3.client('s3',config=config,aws_access_key_id='XXXX',aws_secret_access_key='XXXX')
file = GetS3file("index.html")


def GetS3file(filename):
    global responseout
    responsefile=""
    try:
        print("trying s3 get_object:samplebucket:"+filename)
        response=s3.get_object(Bucket='samplebucket',Key=filename)
        print("After S3 get_object")
        responsefile=response['Body'].read().decode('utf-8')
        responsefile=str(responsefile)

    except botocore.exceptions.ClientError as error:
        print("Boto Exception:"+str(error))
        responsefile="Page Not Found: "+filename+"  b-error:"+str(error)
    except botocore.exceptions.ReadTimeoutError as error:
        print("Boto read timeout Exception:"+str(error))
        responsefile="Page Not Found: "+filename+"  b-error:"+str(error)
    except botocore.exceptions.ConnectTimeoutError as error:
        print("Boto connect timeout Exception:"+str(error))
        responsefile="Page Not Found: "+filename+"  b-error:"+str(error)
    except Exception as e:
        print("Exception:"+str(e))
        responsefile="Page Not Found: "+filename+"  error:"+str(e)
    return responsefile

CloudWatch log:

2023-01-29T13:20:19.390-08:00   INIT_START Runtime Version: python:3.9.v16 Runtime Version ARN: arn:aws:lambda:us-west-2::runtime:xxxx

2023-01-29T13:20:20.059-08:00   START RequestId: XXX Version: $LATEST

2023-01-29T13:20:20.059-08:00   Exists:False

2023-01-29T13:20:20.059-08:00   Getting S3 File:index.html

2023-01-29T13:20:20.059-08:00   trying s3 get_object:samplebucket:index.html

2023-01-29T13:20:40.082-08:00   2023-01-29T21:20:40.082Z XXX Task timed out after 20.02 seconds

2023-01-29T13:20:40.082-08:00   END RequestId: XXX

2023-01-29T13:20:40.082-08:00   REPORT RequestId: XXX Duration: 20022.77 ms Billed Duration: 20000 ms Memory Size: 128 MB Max Memory Used: 80 MB Init Duration: 667.52 ms

A successful log looks like this, which is what happens 90% of the time:

2023-01-29T13:20:27.542-08:00   INIT_START Runtime Version: python:3.9.v16 Runtime Version ARN: arn:aws:lambda:us-west-2::runtime:XXX

2023-01-29T13:20:28.208-08:00   START RequestId: XXX Version: $LATEST

2023-01-29T13:20:28.208-08:00   Exists:False

2023-01-29T13:20:28.208-08:00   Getting S3 File:index.html

2023-01-29T13:20:28.208-08:00   trying s3 get_object:samplebucket:index.html

2023-01-29T13:20:28.512-08:00   After S3 get_object

2023-01-29T13:20:28.555-08:00   Exists:False

2023-01-29T13:20:28.555-08:00   Getting S3 File:mainhead.html

2023-01-29T13:20:28.555-08:00   trying s3 get_object:samplebucket:mainhead.html

2023-01-29T13:20:28.593-08:00   After S3 get_object

2023-01-29T13:20:28.594-08:00   Exists:False

2023-01-29T13:20:28.594-08:00   Getting S3 File:Header.html

2023-01-29T13:20:28.594-08:00   trying s3 get_object:samplebucket:Header.html

2023-01-29T13:20:28.644-08:00   After S3 get_object

2023-01-29T13:20:28.653-08:00   Exists:False

2023-01-29T13:20:28.653-08:00   Getting S3 File:loginform.html

2023-01-29T13:20:28.653-08:00   trying s3 get_object:samplebucket:loginform.html

2023-01-29T13:20:28.728-08:00   After S3 get_object

2023-01-29T13:20:28.729-08:00   index.html

2023-01-29T13:20:28.733-08:00   END RequestId: XXX

2023-01-29T13:20:28.733-08:00   REPORT RequestId: XXX Duration: 525.06 ms Billed Duration: 526 ms Memory Size: 128 MB Max Memory Used: 81 MB Init Duration: 665.54 ms

I tried to read a file from a bucket using Python and boto3's get_object. I expected the read to succeed, and it does 90% of the time. But sometimes it stops working for as long as a few hours and then starts working again for no discernible reason. I tried logging exceptions and lowering the timeouts on the client so that get_object would fail with an error message, but it still doesn't time out within the configured time.

2 Answers


  1. Chosen as BEST ANSWER

    The comment from John Rotenstein is the answer.

    He commented: "Is your AWS Lambda function configured to connect to a VPC? If so, is there a reason for doing this? Your symptom sounds like it might be configured to connect to multiple subnets in a VPC, but not all subnets are necessarily Private Subnets. This can cause random timeouts if a Public Subnet is selected. The preferable situation is to not use a VPC, which then provides direct access to the Internet."

    That was the case: I had a public subnet. After I deleted the public subnet, there was no more downtime.

    Thanks John!
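
To find out which of a function's subnets are public, one option is to inspect each subnet's route table. A Lambda function attached to a VPC does not get a public IP address, so a subnet whose default route points at an internet gateway gives it no working path to S3. The following is only a minimal sketch of that check: the helper names `default_route_target` and `audit_lambda_subnets` are made up for illustration, and the code assumes the caller has permission to call `get_function_configuration` and `describe_route_tables`.

```python
def default_route_target(route_table):
    """Classify the 0.0.0.0/0 route of one route table.

    `route_table` is one entry of the "RouteTables" list returned by
    EC2 describe_route_tables. Returns "igw" (public subnet), "nat"
    (private subnet with outbound access), or "none".
    """
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if (route.get("GatewayId") or "").startswith("igw-"):
            return "igw"
        if route.get("NatGatewayId"):
            return "nat"
    return "none"


def audit_lambda_subnets(function_name):
    """Hypothetical helper: map each subnet of a Lambda function to its default-route type."""
    import boto3  # imported lazily so default_route_target works without boto3

    lam = boto3.client("lambda")
    ec2 = boto3.client("ec2")
    config = lam.get_function_configuration(FunctionName=function_name)
    vpc = config.get("VpcConfig") or {}
    report = {}
    for subnet_id in vpc.get("SubnetIds", []):
        tables = ec2.describe_route_tables(
            Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
        )["RouteTables"]
        if not tables:
            # No explicit association: the subnet uses the VPC's main route table.
            tables = ec2.describe_route_tables(
                Filters=[
                    {"Name": "vpc-id", "Values": [vpc["VpcId"]]},
                    {"Name": "association.main", "Values": ["true"]},
                ]
            )["RouteTables"]
        report[subnet_id] = default_route_target(tables[0]) if tables else "none"
    return report
```

Any subnet reported as "igw" is a candidate for the intermittent timeouts described above. A subnet reported as "none" may still reach S3 through a gateway endpoint, which this sketch does not check.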


  2. In most cases this should not occur if you haven't configured a VPC for the Lambda function.

    If you have configured a VPC, make sure the subnets attached to the function have a route to a NAT gateway, or use an S3 VPC endpoint.

    Once the subnets are configured correctly, this issue should be resolved.
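
If you go the VPC endpoint route, a gateway endpoint for S3 can be created with the EC2 `create_vpc_endpoint` call. This is a rough sketch rather than a drop-in fix: the two function names are hypothetical, and the VPC ID, region, and route table IDs are placeholders you would replace with your own values.

```python
def build_s3_gateway_endpoint_request(vpc_id, region, route_table_ids):
    """Build the keyword arguments for EC2 create_vpc_endpoint (S3 gateway endpoint)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # S3 gateway endpoint service names follow com.amazonaws.<region>.s3
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Route tables of the subnets the Lambda function is attached to
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, region, route_table_ids):
    """Hypothetical helper: create the endpoint and return the API response."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    request = build_s3_gateway_endpoint_request(vpc_id, region, route_table_ids)
    return ec2.create_vpc_endpoint(**request)
```

For example, `create_s3_gateway_endpoint("vpc-XXXX", "us-west-2", ["rtb-XXXX"])` would add an S3 route to the listed route table, so the function can reach the bucket without a NAT gateway.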
