
I have written an AWS Lambda function that downloads a file from an S3 location to the /tmp directory (Lambda's local storage).
The download succeeds, but the downloaded file is a different size from the original, and I am not sure why.

    import os

    import boto3


    def data_processor(event, context):
        print("EVENT:: ", event)
        bucket_name = 'asr-collection'
        fileKey = 'cc_continuous/testing/1645136763813.wav'

        # Create the target directory under /tmp if it does not exist yet
        output_path = os.path.join('/tmp', 'mydir')
        os.makedirs(output_path, exist_ok=True)

        s3 = boto3.client("s3")

        # Build the local path once and reuse it for the download and the size check
        new_file_name = os.path.join(output_path, os.path.basename(fileKey))

        s3.download_file(Bucket=bucket_name, Key=fileKey, Filename=new_file_name)

        print('File size is: ' + str(os.path.getsize(new_file_name)))

        return None

Output:

File size is: 337964

Actual file size: 230 MB
Downloaded file size: 330 KB

I also tried download_fileobj().
How can I download the file as it is, without any data loss?
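
One way to narrow this down is to compare the size S3 reports for the object with the size of the local copy. Below is a minimal sketch, assuming the object's metadata is readable through `head_object`. The helper name `sizes_match` and the injected `s3_client` parameter are illustrative and not part of the code above.

```python
import os


def sizes_match(s3_client, bucket, key, local_path):
    """Return (ok, remote_size, local_size) for an S3 object and its local copy.

    `s3_client` is expected to behave like boto3.client("s3"); it is passed
    in so the check can be exercised without AWS credentials.
    """
    # head_object returns the object's metadata; ContentLength is its size in bytes.
    remote_size = s3_client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    local_size = os.path.getsize(local_path)
    return remote_size == local_size, remote_size, local_size
```

In the handler this could be called as `sizes_match(s3, bucket_name, fileKey, new_file_name)` right after `download_file`. If the remote size also comes back as roughly 330 KB, the object stored under that key is likely smaller than expected, rather than the download dropping data.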

2 Answers


  1. Chosen as BEST ANSWER

    Using the S3 resource instead of the client fixed it.

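    import os
    import boto3

    bucket_name = 'asr-collection'  # assumed: the bucket name from the question
    s3 = boto3.resource('s3')
    keys = ['TestFolder1/testing/1651219413148.wav']
    for key in keys:
        local_file_name = os.path.join('/tmp', key)
        # The key contains sub-folders, so create them under /tmp first
        os.makedirs(os.path.dirname(local_file_name), exist_ok=True)
        s3.Bucket(bucket_name).download_file(key, local_file_name)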
                  
    

  2. The problem may be that the bucket you are downloading from is in a different region from the one your Lambda is hosted in. Apparently, this makes no difference when the code runs locally.

    Check your bucket's location relative to your Lambda's region.

    Note that setting the region on your client lets you use a Lambda in a different region from your bucket. However, if you intend to pull down larger files, you will get lower network latency by keeping your Lambda in the same region as your bucket.
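
The region check described above can be sketched as follows. This assumes the code runs inside Lambda, where the runtime sets the `AWS_REGION` environment variable, and that the execution role is allowed to call `get_bucket_location`. The helper names `bucket_region` and `same_region` are illustrative, and the client is passed in so the logic can be tried with a stub.

```python
import os


def bucket_region(s3_client, bucket):
    """Return the region a bucket lives in, using get_bucket_location."""
    resp = s3_client.get_bucket_location(Bucket=bucket)
    # S3 reports buckets in us-east-1 with a null LocationConstraint.
    return resp.get("LocationConstraint") or "us-east-1"


def same_region(s3_client, bucket, lambda_region=None):
    """Compare the bucket's region with the region the function runs in."""
    # Fall back to the AWS_REGION variable when no region is given explicitly.
    lambda_region = lambda_region or os.environ.get("AWS_REGION")
    return bucket_region(s3_client, bucket) == lambda_region
```

With boto3 this could be used as `same_region(boto3.client("s3"), bucket_name)`. If the regions differ, the client can be created for the bucket's region with `boto3.client("s3", region_name=...)`, passing in the value returned by `bucket_region`.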
