I am trying to replicate data from an RDS MySQL instance to Redshift Serverless, using DMS Serverless.
Following AWS docs, I have set up the following:
Prerequisites for AWS Database Migration Service
- A Redshift Serverless instance, with subnets and security groups
- An RDS MySQL instance, with subnets and security groups
- A DMS Serverless replication, with subnets and security groups
- IAM roles and policies
- Endpoints in DMS for MySQL and Redshift Serverless, and a VPC endpoint for Redshift Serverless
All instances and endpoints are in the same VPC.
RDS MySQL is running. I can connect to it through a client and run queries.
Note: I have successfully replicated data from the same RDS instance to S3, using a separate replication.
Redshift Serverless is running. I can connect to it through Redshift query editor v2.
Network Config
- The security group (DMS-SG) for DMS has an Outbound rule of
All traffic/All/All/0.0.0.0/0
- The security group (Red-SG) for Redshift has an Inbound rule of
Redshift/TCP/5439/id of DMS-SG
- The security group (Red-SG) has an Inbound rule of
MYSQL/Aurora/TCP/3306/id of DMS-SG
(MySQL and Redshift use the same SG) - Endpoints in DMS for RDS MySQL and Redshift Serverless
- A VPC endpoint (interface type) for Redshift (Enhanced VPC routing is turned ON), as per AWS docs:
Configuring VPC endpoints as AWS DMS source and target endpoints
-
A DMS Subnet group, with at least 3 subnets in 3 different AZs, each
with a pool of free IPs, required for Redshift Serverless processes.
Can’t find the docs for that, but I got errors telling me to do that
when I created less than 3 subnets. -
A role named
dms-vpc-role
with the following permissions: -
Cloudwatch access:
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"logs:*"
],
"Effect": "Allow",
"Resource": "*"
}
]
}
AmazonDMSVPCManagementRole
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:CreateNetworkInterface",
"ec2:DescribeAvailabilityZones",
"ec2:DescribeInternetGateways",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs",
"ec2:DeleteNetworkInterface",
"ec2:ModifyNetworkInterfaceAttribute"
],
"Resource": "*"
}
]
}
AmazonDMSRedshiftS3Role
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:CreateBucket",
"s3:ListBucket",
"s3:DeleteBucket",
"s3:GetBucketLocation",
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:GetObjectVersion",
"s3:GetBucketPolicy",
"s3:PutBucketPolicy",
"s3:GetBucketAcl",
"s3:PutBucketVersioning",
"s3:GetBucketVersioning",
"s3:PutLifecycleConfiguration",
"s3:GetLifecycleConfiguration",
"s3:DeleteBucketPolicy"
],
"Resource": "arn:aws:s3:::dms-*"
}
]
}
The bucket for Redshift is named dms-*
dms-s3-access
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:DeleteObject",
"s3:PutObjectTagging"
],
"Resource": [
"arn:aws:s3:::gs-xxx-xxx-bucket/*",
"arn:aws:s3:::dms-xx-xxxshift/*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::gs-xxx-xxx-bucket",
"arn:aws:s3:::dms-xx-xxxshift"
]
}
]
}
And a trust policy as part of dms-vpc-role
:
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Stmt1",
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
},
{
"Sid": "Stmt2",
"Effect": "Allow",
"Principal": {
"Service": "redshift-serverless.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
According to this SO post aws-dms-endpoint-connection-to-redshift-not-working
(and some AWS docs as well), I need 2 more roles:
dms-cloudwatch-logs-role
dms-access-for-endpoint
Not sure where to find the policy settings for these, or if they are covered by the policies I created above. I believe they are covered, but correct me if not.
My Redshift Serverless endpoint URL is xx-workgroup.123456781234.ca-central-1.redshift-serverless.amazonaws.com:5439/dev
.
As per AWS docs, this is the Server name to be specified in the DMS target endpoint definition.
Do I need to specify the port/DBname
as part of the endpoint URL, when I am specifying a different DBname in the Database Name
parameter of the endpoint config in DMS? Could this be conflicting?
Another thing which I can’t figure out is where to specify this Trust policy, as required by this AWS doc:
Target Redshift RSServerless
"PolicyVersion": {
"CreateDate": "2016-05-23T16:29:57Z",
"VersionId": "v3",
"Document": {
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"ec2:CreateNetworkInterface",
"ec2:DescribeAvailabilityZones",
"ec2:DescribeInternetGateways",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs",
"ec2:DeleteNetworkInterface",
"ec2:ModifyNetworkInterfaceAttribute"
],
"Resource": "arn:aws:service:region:account:resourcetype/id",
"Effect": "Allow"
}
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": "redshift-serverless.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
},
"IsDefaultVersion": true
}
}
As can be seen above, I have added this part
"Sid": "Stmt2",
"Effect": "Allow",
"Principal": {
"Service": "redshift-serverless.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
to my Trust policy above.
The Problem
Despite all these settings, I am getting the following error when I run my replication:
Test connection failed for endpoint 'ep-target-redshift' and replication config 'rep-mysql-to-redshift'.'
Also don’t see any Cloudwatch logs, but I believe they are only generated when the replication runs.
Would appreciate some help on what I’m missing. Let me know if any clarification is required.
2
Answers
I set up DMS provisioned and Redshift Provisioned, set up a DMS replication instance to test connections, but the connection between DMS and Redshift still failed.
Changing my Redshift cluster to be publicly accessible allowed connections, but that was not the solution I wanted.
Eventually, after some more digging, I found that the VPC endpoint for Redshift was missing an endpoint access policy, since DMS is version 3.5.1, Enhanced VPC routing is ON and Redshift is not publicly accessible, so it requires VPC endpoints.
Here's the endpoint and
PolicyDocument
, note that theServiceName
is region-specific:Hope this is helpful.
The role and policy configuration for
dms-access-for-endpoint
role:The role and policy configuration for
dms-cloudwatch-logs-role
role:Redshift Serverless endpoint URL is
xx-workgroup.123456781234.ca-central-1.redshift-serverless.amazonaws.com
without port and DB name in the URL. This should be specified under server name. The port and DB name should also be specified in DMS endpoint configuration.