I have a spring boot app deployed on AWS EKS POD and have provisioned AWS MSK with IAM authentication they both are under the same VPC and roles has been configured as well as in MSK inbound rules the port 9098 has also being added.
To test connectivity between EKS and MSK i did telnet with broker name and port 9098 it was successfully connected as well when my run spring boot app in eks pod it gives the below error:
org.springframework.kafka.KafkaException: Send failed;nested exception in org.apache.kafka.common.errors. SaslAuthenticationException: [63a192cc-599-43e-bfe8-bc880e50c2e1]: Access Denied
org.apache. kafka.clients.Networkclient: [Producer clientId=producer-1] Connection to node -3 b-3.xxxx.xxxx.amazonaws.com/10.7.2.1:9098) failed authentication due to: [63a192cc-599-43e-bfe8-bc880e50
Created a role in IAM attached it to EKS pod and assigned the below policies:
{
"version": "2012-10-17",
"Statement": [
{
"Sid": "AllowMskAccessCluster",
"Effect": "Allow",
"Action": [
"kafka:ListScramSecrets",
"kafka:GetBootstrapBrokers",
"kafka:DescribeCluster",
"kafka-cluster:DescribeCluster",
"kafka-cluster:Connect",
"kafka-cluster:AlterCluster",
],
"Resource": "AWS_EKS_CLUSTER_ARN"
},
{
"Sid": "AllowMskAccessTopic",
"Effect": "Allow",
"Action": [
"kakfa-cluster:DescribeTopicDynamicConfiguration",
"kakfa-cluster:DescribeTopic",
"kakfa-cluster:DeleteTopic",
"kakfa-cluster:CreateTopic",
"kakfa-cluster:AlterTopicDynamicConfiguration",
"kakfa-cluster:AlterTopic",
],
"Resource": [
"arn:AWS_EKS_CLUSTER_ARN/*",
"*"
]
},
{
"Sid": "AllowMskAccessGroup",
"Effect": "Allow",
"Action": [
"kafka-cluster:DescribeCluster",
"kafka-cluster:DeleteGroup",
"kafka-cluster:AlterGroup",
],
"Resource": "AWS_EKS_CLUSTER_ARN/*"
}
]
}
{
"version": "2012-10-17",
"Statement": [
{
"Sid": "AllowMskAccessCluster",
"Effect": "Allow",
"Action": [
"kafka:ListScramSecrets",
"kafka:GetBootstrapBrokers",
"kafka:DescribeCluster",
"kafka-cluster:WriteDataIdempotently",
"kafka-cluster:Connect",
],
"Resource": "AWS_EKS_CLUSTER_ARN"
},
{
"Sid": "AllowMskAccessTopic",
"Effect": "Allow",
"Action": [
"kakfa-cluster:WriteData",
"kakfa-cluster:DescribeTransactionalId",
"kakfa-cluster:DescribeTopic",
"kakfa-cluster:AlterTransactionalId",
],
"Resource":"*"
},
{
"Sid": "AllowMskAccessGroup",
"Effect": "Allow",
"Action": "kakfa-cluster":DescribeGroup,
"Resource": "AWS_EKS_CLUSTER_ARN/*"
}
]
}
{
"version": "2012-10-17",
"Statement": [
{
"Sid": "AllowMskAccessCluster",
"Effect": "Allow",
"Action": [
"kafka:ListScramSecrets",
"kafka:GetBootstrapBrokers",
"kafka:DescribeCluster",
"kafka-cluster:Connect",
],
"Resource": "AWS_EKS_CLUSTER_ARN"
},
{
"Sid": "AllowMskAccessTopic",
"Effect": "Allow",
"Action": [
"kakfa-cluster:ReadData",
"kakfa-cluster:DescribeTopic",
],
"Resource": "*"
},
{
"Sid": "AllowMskAccessGroup",
"Effect": "Allow",
"Action": [
"kafka-cluster:DescribeGroup",
"kafka-cluster:AlterGroup",
],
"Resource": "AWS_EKS_CLUSTER_ARN/*"
}
]
}
My spring boot kafka config:
ssl.truststore.location=path to trust file
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
im using this dependencies in my spring app:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>sts</artifactId>
<version>2.16.13</version>
</dependency>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>apache-client</artifactId>
<version>2.16.13</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.13</artifactId>
<version>3.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>spring-kafka</artifactId>
</dependency>
<dependency>
<groupId>software.amazon.msk</groupId>
<artifactId>aws-msk-iam-auth</artifactId>
<version>1.0.0</version>
</dependency>
2
Answers
I would love to help more, but this policy is a little hard to validate.
Some of the actions don’t have the right resources on them you list EKS ARN for many
kafka:*
andkafka-cluster:*
that should use MSK ARNs, etc.There are syntax issues like
"Action": "kakfa-cluster":DescribeGroup,
If these are just typos, it should be sufficient to redact your account number in the policy but leave the rest in place of you can so we can see as much of the real policy. It is just not r3al useful as is.
Have you tried to run IAM policy simulator on this policy, and use it to simulate the
kafka-cluster:Connect
action on the ARN for your cluster? It should tell you if your policy gives you access or not.Have you tried to start with a policy that gives blanket access? If you can get in with a very loose policy, tighten it up from there until it breaks again to pinpoint the problem. e.g. a policy with a single statement that gives you all permissions to MSK for any resource. I would again test this with the policy simulator for any validation issues and that it really does give you access. If it works there, but not EKS then you might not be using a role that has access to this policy as you think.
See https://github.com/aws/aws-msk-iam-auth/pull/18/files and https://github.com/aws/aws-msk-iam-auth/compare/1.0.0…1.1.0 looking for the commit message "Add sts module to the implementation dependencies." where you can see com.amazonaws:aws-java-sdk-sts added as a dependency to the Gradle bundle.
Search in AWS CloudTrail for the AWS API event that is failing. It should have more information about what is failing exactly. I believe you can find it by the request id (UUID) in your error message, but you might have to look by service and action around the time you are seeing the error in your logs.
Use the
awsDebugCreds
option to debug which credentials are being used to authenticate to IAM. https://github.com/aws/aws-msk-iam-auth#finding-out-which-identity-is-being-usedsasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required awsDebugCreds=true;
Hope this helps. If not and you can provide a cleaned-up version of the policy, screenshots of the Policy Simulator showing it gives access, or other information I can take another look. You got this!
the msk cluster should create a role that allows to read/write to the cluster
kafka-cluster:*
.the created role should be assumed in your kafka connector this way:
and then, add to this policy the operation that you want to do with the kafka cluster:
hope that help you