I have a PostgreSQL RDS instance, and I am trying to set up a Glue connection to it from the same AWS account. I used the RDS option and selected my existing instance, then set the network to the same VPC, subnet, and security groups that the RDS instance is hosted on.
Once the connection was ready, I tested it, but I got an error saying that the VPC subnet could not find an S3 endpoint.
I know from reading the documentation that I have to create a gateway-type VPC endpoint to S3, which I can do.
But why is an S3 endpoint needed for a JDBC connection? What does S3 have to do with reaching a database if we are already in the same network/subnet and have the correct security group to reach it? At what point in a JDBC connection does it need a gateway to S3?
2 Answers
The documentation explains this:
When you set up a Glue job, under the Job details tab in the Advanced properties section you have to specify things like Script path, Temporary path, etc., which point to S3 locations. So the Glue job needs to be able to access S3.

When you associate a connection with your Glue job, it causes the Glue job to be run within the VPC specified in the connection. So now your Glue job needs a way to connect to S3 (which is external to your VPC) from within the VPC.
From this page of the Glue Developer Guide:
Try adding an S3 gateway endpoint to your VPC.
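If you would rather script the endpoint than click through the console, here is a minimal sketch using boto3's EC2 `create_vpc_endpoint` call. The function names are my own, and the VPC ID, region, and route table IDs are placeholders you would replace with the values for the VPC your Glue connection uses. The route tables should be the ones associated with the connection's subnet, since a gateway endpoint works by adding a route to them.

```python
def s3_gateway_endpoint_params(vpc_id, region, route_table_ids):
    """Build the keyword arguments for EC2 create_vpc_endpoint.

    Hypothetical helper: it only assembles the request, so it can be
    inspected without touching AWS.
    """
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # S3 gateway endpoint service names follow this regional pattern.
        "ServiceName": f"com.amazonaws.{region}.s3",
        # A gateway endpoint is attached to route tables, not to subnets.
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, region, route_table_ids):
    """Create the endpoint and return its ID (needs AWS credentials)."""
    # Imported here so the helper above works even without boto3 installed.
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    params = s3_gateway_endpoint_params(vpc_id, region, route_table_ids)
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]
```

You would call it as something like `create_s3_gateway_endpoint("vpc-...", "eu-west-1", ["rtb-..."])` with your own IDs and region, then test the Glue connection again.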