skip to Main Content

I am trying to replicate data from an RDS MySQL instance to Redshift Serverless, using DMS Serverless.

Following AWS docs, I have set up the following:

Prerequisites for AWS Database Migration Service

  1. A Redshift Serverless instance, with subnets and security groups
  2. An RDS MySQL instance, with subnets and security groups
  3. A DMS Serverless replication, with subnets and security groups
  4. IAM roles and policies
  5. Endpoints in DMS for MySQL and Redshift Serverless, and a VPC endpoint for Redshift Serverless

All instances and endpoints are in the same VPC.

RDS MySQL is running. I can connect to it through a client and run queries.
Note: I have successfully replicated data from the same RDS instance to S3, using a separate replication.

Redshift Serverless is running. I can connect to it through Redshift query editor v2.

Network Config

  1. The security group (DMS-SG) for DMS has an Outbound rule of All traffic/All/All/0.0.0.0/0
  2. The security group (Red-SG) for Redshift has an Inbound rule of Redshift/TCP/5439/id of DMS-SG
  3. The security group (Red-SG) has an Inbound rule of MYSQL/Aurora/TCP/3306/id of DMS-SG (MySQL and Redshift use the same SG)
  4. Endpoints in DMS for RDS MySQL and Redshift Serverless
  5. A VPC endpoint (interface type) for Redshift (Enhanced VPC routing is turned ON), as per AWS docs:

Enhanced VPC Routing

Configuring VPC endpoints as AWS DMS source and target endpoints

  • A DMS Subnet group, with at least 3 subnets in 3 different AZs, each
    with a pool of free IPs, required for Redshift Serverless processes.
    Can’t find the docs for that, but I got errors telling me to do that
    when I created less than 3 subnets.

  • A role named dms-vpc-role with the following permissions:

  • Cloudwatch access:

    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "logs:*"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}

AmazonDMSVPCManagementRole

    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeInternetGateways",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcs",
                "ec2:DeleteNetworkInterface",
                "ec2:ModifyNetworkInterfaceAttribute"
            ],
            "Resource": "*"
        }
    ]
}

AmazonDMSRedshiftS3Role

    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:CreateBucket",
                "s3:ListBucket",
                "s3:DeleteBucket",
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:GetObjectVersion",
                "s3:GetBucketPolicy",
                "s3:PutBucketPolicy",
                "s3:GetBucketAcl",
                "s3:PutBucketVersioning",
                "s3:GetBucketVersioning",
                "s3:PutLifecycleConfiguration",
                "s3:GetLifecycleConfiguration",
                "s3:DeleteBucketPolicy"
            ],
            "Resource": "arn:aws:s3:::dms-*"
        }
    ]
}

The bucket for Redshift is named dms-*

dms-s3-access

    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:PutObjectTagging"
            ],
            "Resource": [
                "arn:aws:s3:::gs-xxx-xxx-bucket/*",
                "arn:aws:s3:::dms-xx-xxxshift/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::gs-xxx-xxx-bucket",
                "arn:aws:s3:::dms-xx-xxxshift"
            ]
        }
    ]
}

And a trust policy as part of dms-vpc-role:

    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1",
            "Effect": "Allow",
            "Principal": {
                "Service": "dms.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        },
        {
            "Sid": "Stmt2",
            "Effect": "Allow",
            "Principal": {
                "Service": "redshift-serverless.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

According to this SO post aws-dms-endpoint-connection-to-redshift-not-working

(and some AWS docs as well), I need 2 more roles:

dms-cloudwatch-logs-role
dms-access-for-endpoint

Not sure where to find the policy settings for these, or if they are covered by the policies I created above. I believe they are covered, but correct me if not.

My Redshift Serverless endpoint URL is xx-workgroup.123456781234.ca-central-1.redshift-serverless.amazonaws.com:5439/dev.
As per AWS docs, this is the Server name to be specified in the DMS target endpoint definition.
Do I need to specify the port/DBname as part of the endpoint URL, when I am specifying a different DBname in the Database Name parameter of the endpoint config in DMS? Could this be conflicting?

Another thing which I can’t figure out is where to specify this Trust policy, as required by this AWS doc:
Target Redshift RSServerless

    "PolicyVersion": {
        "CreateDate": "2016-05-23T16:29:57Z", 
        "VersionId": "v3", 
        "Document": {
        "Version": "2012-10-17", 
        "Statement": [
            {
                "Action": [
                    "ec2:CreateNetworkInterface", 
                    "ec2:DescribeAvailabilityZones", 
                    "ec2:DescribeInternetGateways", 
                    "ec2:DescribeSecurityGroups", 
                    "ec2:DescribeSubnets", 
                    "ec2:DescribeVpcs", 
                    "ec2:DeleteNetworkInterface", 
                    "ec2:ModifyNetworkInterfaceAttribute"
                ], 
                "Resource": "arn:aws:service:region:account:resourcetype/id", 
                "Effect": "Allow"
            }
            {
                "Sid": "",
                "Effect": "Allow",
                "Principal": {
                    "Service": "redshift-serverless.amazonaws.com"
                },
                "Action": "sts:AssumeRole"
            }
        ]
    }, 
    "IsDefaultVersion": true
    }
}

As can be seen above, I have added this part

            "Sid": "Stmt2",
            "Effect": "Allow",
            "Principal": {
                "Service": "redshift-serverless.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }

to my Trust policy above.

The Problem

Despite all these settings, I am getting the following error when I run my replication:

Test connection failed for endpoint 'ep-target-redshift' and replication config 'rep-mysql-to-redshift'.'

Also don’t see any Cloudwatch logs, but I believe they are only generated when the replication runs.

Would appreciate some help on what I’m missing. Let me know if any clarification is required.

2

Answers


  1. Chosen as BEST ANSWER

    I set up DMS provisioned and Redshift Provisioned, set up a DMS replication instance to test connections, but the connection between DMS and Redshift still failed.

    Changing my Redshift cluster to be publicly accessible allowed connections, but that was not the solution I wanted.

    Eventually, after some more digging, I found that the VPC endpoint for Redshift was missing an endpoint access policy, since DMS is version 3.5.1, Enhanced VPC routing is ON and Redshift is not publicly accessible, so it requires VPC endpoints.

    Here's the endpoint and PolicyDocument, note that the ServiceName is region-specific:

    DMSVPCEndpointRedshift:
    Type: "AWS::EC2::VPCEndpoint"
    Properties:
      VpcEndpointType: "Interface"
      VpcId: !Ref DMSVPC
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.redshift'
      PolicyDocument: !Sub |
          {
            "Statement": [
                {
                    "Sid": "AccessSpecificAccount",
                    "Principal": {
                        "AWS": "${AWS::AccountId}"
                    },
                    "Action": "redshift:*",
                    "Effect": "Allow",
                    "Resource": "*"
                }
            ]
          }
    
      SubnetIds: 
      - !Ref EC2Subnet
      - !Ref EC2Subnet3
      PrivateDnsEnabled: true
      SecurityGroupIds: 
      - !Ref EC2SecurityGroup
    

    Hope this is helpful.


  2. The role and policy configuration for dms-access-for-endpoint role:

    DMSAccessEndpointRole:
        Type: AWS::IAM::Role
        Properties:
          AssumeRolePolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Principal:
                  Service:
                    - dms.amazonaws.com
                Action:
                  - 'sts:AssumeRole'
              - Effect: Allow
                Principal:
                  Service:
                    - redshift-serverless.amazonaws.com
                Action:
                  - 'sts:AssumeRole'
          ManagedPolicyArns:
            - arn:aws:iam::aws:policy/service-role/AmazonDMSRedshiftS3Role
          Path: /
          RoleName: "dms-access-for-endpoint"
    

    The role and policy configuration for dms-cloudwatch-logs-role role:

    DMSCloudWatchLogRole:
        Type: "AWS::IAM::Role"
        Properties:
          RoleName: dms-cloudwatch-logs-role
          ManagedPolicyArns:
            - arn:aws:iam::aws:policy/service-role/AmazonDMSCloudWatchLogsRole
          AssumeRolePolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: "Allow"
                Principal:
                  Service:
                    - "dms.amazonaws.com"
                Action:
                  - "sts:AssumeRole"
          Path: "/"
    

    Redshift Serverless endpoint URL is xx-workgroup.123456781234.ca-central-1.redshift-serverless.amazonaws.com without port and DB name in the URL. This should be specified under server name. The port and DB name should also be specified in DMS endpoint configuration.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search