
I’ve been trying to ingest data from an AWS S3 bucket into AWS OpenSearch Serverless using AWS Lambda, but I can’t find any documentation on how to do it. Any ideas? I’ve seen plenty of examples for AWS OpenSearch Service, but would they work the same way with Serverless? I’m a bit new to AWS.

I’ve looked at the boto3 client.

2 Answers


  1. If the data is already present, you would use boto3 to read the files from S3 and then push the documents to the OpenSearch endpoint via its `_bulk` API.

    If files can arrive later, you can have S3 fire an event so that the same code (slightly modified) runs in a Lambda function and pushes the data to OpenSearch.

    What kind of data is this?
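    The flow above might be sketched as a Lambda handler like the one below: build a newline-delimited `_bulk` payload from the S3 object named in the event and post it with opensearch-py. The region, collection endpoint, index name, and the assumption that files are newline-delimited JSON are all placeholders; note that for OpenSearch Serverless, requests are signed with the service name `aoss` rather than `es`.

```python
import json

def build_bulk_body(docs, index):
    """Serialize documents into the _bulk format: an action line
    followed by the document itself, one pair per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # a _bulk body must end with a newline

def handler(event, context):
    """Triggered by an s3:ObjectCreated:* event notification."""
    # boto3 ships with the Lambda runtime; opensearch-py must be
    # packaged with the function or provided via a layer.
    import boto3
    from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

    region = "us-east-1"  # placeholder: your collection's region
    credentials = boto3.Session().get_credentials()
    # Serverless collections sign with service "aoss"
    # (the managed OpenSearch Service uses "es")
    auth = AWSV4SignerAuth(credentials, region, "aoss")
    client = OpenSearch(
        # placeholder: your Serverless collection endpoint
        hosts=[{"host": "my-collection.us-east-1.aoss.amazonaws.com", "port": 443}],
        http_auth=auth,
        use_ssl=True,
        connection_class=RequestsHttpConnection,
    )

    s3 = boto3.client("s3")
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        text = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        # assumes newline-delimited JSON, one document per line
        docs = (json.loads(line) for line in text.splitlines() if line.strip())
        client.bulk(body=build_bulk_body(docs, "my-index"))
```

    The Lambda's execution role also needs `s3:GetObject` on the bucket and a data access policy on the Serverless collection granting it write access.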

  2. As Prabhat mentions, boto3 is certainly an option; however, the AWS SDK for Pandas (previously AWS Data Wrangler) is a much simpler approach. I’ve used it extensively for moving data from CSVs, S3, and other locations into OpenSearch with ease.

    Using the AWS SDK for Pandas, you might achieve what you’re looking for like this…

    import awswrangler as wr
    from opensearchpy import OpenSearch
    
    # read the JSON objects under the prefix into a DataFrame
    items = wr.s3.read_json(path="s3://my-bucket/my-folder/")
    
    # connect + upload to OpenSearch (a target index name is required)
    my_client = OpenSearch(...)
    wr.opensearch.index_df(client=my_client, df=items, index="my-index")
    

    The AWS SDK for Pandas can iterate over chunks of S3 items, and there’s a tutorial on indexing JSON (and other file types) from S3 to OpenSearch.
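    For larger datasets, the chunked read might look like the sketch below. The endpoint, bucket path, index name, and `chunksize` value are placeholders, and you should verify `wr.opensearch.connect` and the `chunksize=` behavior against the version of the library you have installed.

```python
# Sketch: chunked S3 -> OpenSearch indexing with the AWS SDK for Pandas.
# All names (collection endpoint, bucket, index) are placeholders.

def index_chunks(chunks, index_chunk):
    """Apply an indexing callable to each DataFrame chunk and
    return the number of chunks processed."""
    n = 0
    for chunk in chunks:
        index_chunk(chunk)
        n += 1
    return n

if __name__ == "__main__":
    import awswrangler as wr

    client = wr.opensearch.connect(
        host="my-collection.us-east-1.aoss.amazonaws.com",  # placeholder
    )
    # with chunksize set, read_json yields DataFrames instead of one frame
    chunks = wr.s3.read_json(
        path="s3://my-bucket/my-folder/", lines=True, chunksize=1000
    )
    index_chunks(
        chunks,
        lambda df: wr.opensearch.index_df(client=client, df=df, index="my-index"),
    )
```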

    Of course, you would need to create a Lambda layer to make the package available in your function. Here is a (pretty much) one-click script to do this.
