I’ve taken a script written by Paul Davies about reingesting Splunk Logs from the AWS Cloud.
When my logs fail to process in Kinesis Firehose, they get placed in a backup S3 bucket. The current format of the key is the following:
Folder/Folder/Year/Month/Day/HH/failedlogs
Example:
splunk-kinesis-firehose/splunk-failed/2023/01/01/01/failedlogs.gz
The key lookup in the script is set like this:
key=urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
Is there a way to get all the files within the S3 bucket under the subfolder – splunk-kinesis-firehose – or is there a better way of looping through all the folders?
2 Answers
As John Rotenshtein says, your Lambda function, if invoked by an S3 trigger, will receive the key as part of the request. You could also invoke the Lambda manually and pass the key in the request.
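For instance, a manual invocation could pass a synthetic S3-style event so your existing key lookup keeps working. This is only a sketch: the function name and bucket name below are placeholders, not values from your setup.

import json
import boto3

lambda_client = boto3.client("lambda")

# Hypothetical function and bucket names -- replace with your own.
payload = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-backup-bucket"},
                "object": {"key": "splunk-kinesis-firehose/splunk-failed/2023/01/01/01/failedlogs.gz"},
            }
        }
    ]
}

# Asynchronous invocation with the synthetic event as the payload
lambda_client.invoke(
    FunctionName="splunk-reingest-lambda",
    InvocationType="Event",
    Payload=json.dumps(payload).encode("utf-8"),
)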
But if, for some reason, you want to do a full (or partial) listing under a path, then please take a look at s3list(), which I describe in this SO post. It is a fairly general S3 lister. In your case, you would call it with the splunk-kinesis-firehose prefix to get all the objects under that path, or narrow the prefix to get just the files for, say, the month of May 2023, as sketched below.
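A minimal sketch of those two calls, assuming s3list accepts a boto3 Bucket resource and a path prefix (check the linked post for the exact signature; the bucket name is a placeholder):

import boto3

s3 = boto3.resource("s3")
bucket = s3.Bucket("my-backup-bucket")  # hypothetical bucket name

# Everything under the failed-logs prefix
for obj in s3list(bucket, "splunk-kinesis-firehose"):
    print(obj)

# Only the objects for May 2023
for obj in s3list(bucket, "splunk-kinesis-firehose/splunk-failed/2023/05"):
    print(obj)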
Note that s3list is a generator: you can start listing a trillion objects and stop whenever you like (internally, it goes by chunks of up to 1,000 objects per call to AWS).

To list objects in an Amazon S3 bucket, you can use the client method list_objects_v2 (see the Boto3 documentation):
Or you can use the resource method, which is a bit more Pythonic:
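For example (again with a placeholder bucket name):

import boto3

s3 = boto3.resource("s3")
bucket = s3.Bucket("my-backup-bucket")  # hypothetical bucket name

# objects.filter handles pagination for you
for obj in bucket.objects.filter(Prefix="splunk-kinesis-firehose/"):
    print(obj.key)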