We are deploying Azure Key Vaults with secrets from a DevOps pipeline using Terraform. The Key Vault module also adds a private endpoint. The service connection has Key Vault Administrator IAM access on the subscription. The pipeline runs on a self-hosted agent, which has access to the newly deployed virtual network that the private endpoint uses.
We get a 403 error on the first run when the pipeline tries to deploy the secrets. Without any change, the second run succeeds. It is as if the first run does not take the IAM role into account, but the second run does and creates the secrets successfully.
Is there a way for the pipeline to "recognize" the IAM role when it is deployed the first time?
2 Answers
These failures come from the time IAM (Identity and Access Management) role assignments take to propagate. In Azure, there is a brief delay before a new role assignment becomes active, so an operation that relies on the new permissions can fail if it runs immediately after the assignment, while a later attempt succeeds.
One strategy is to add a delay or retry mechanism to the pipeline, after assigning IAM roles and before starting the operations that need those permissions. Terraform does not natively wait for IAM role propagation. You can work around this with Terraform's `null_resource` combined with a `local-exec` provisioner, or with external data sources, to either introduce a delay or run a check that confirms the permissions have propagated. I tried this in a demo Terraform configuration.
My Terraform configuration:
Deployment succeeded:
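For illustration, the `null_resource` approach described in this answer could be sketched roughly as follows. This is a minimal sketch, not the configuration from the demo: the resource names, the `secret_value` variable, and the 120-second wait are all assumptions, and `azurerm_key_vault.example` is assumed to be declared elsewhere in the module with Azure RBAC authorization enabled.

```hcl
# All names below are hypothetical; adapt them to your module.
data "azurerm_client_config" "current" {}

variable "secret_value" {
  type      = string
  sensitive = true
}

# Role assignment for the identity running Terraform.
resource "azurerm_role_assignment" "kv_admin" {
  scope                = azurerm_key_vault.example.id # assumed to exist elsewhere
  role_definition_name = "Key Vault Administrator"
  principal_id         = data.azurerm_client_config.current.object_id
}

# Pause after the role assignment so it has time to propagate.
resource "null_resource" "wait_for_rbac" {
  # Run the wait again whenever the role assignment is recreated.
  triggers = {
    role_assignment_id = azurerm_role_assignment.kv_admin.id
  }

  provisioner "local-exec" {
    # Assumes an agent with a POSIX shell; 120 seconds is an arbitrary value.
    command = "sleep 120"
  }
}

resource "azurerm_key_vault_secret" "example" {
  name         = "example-secret"
  value        = var.secret_value
  key_vault_id = azurerm_key_vault.example.id

  # Create the secret only after the wait has finished.
  depends_on = [null_resource.wait_for_rbac]
}
```

A fixed sleep is the simplest option. Replacing the `sleep` command with a script that polls until a test operation succeeds would avoid waiting longer than necessary, at the cost of more moving parts.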
Terraform has a resource that simplifies these scenarios. It is called `time_sleep` and comes from the `hashicorp/time` provider. In your scenario, the code would look like this:
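A minimal sketch of that pattern, under stated assumptions: `azurerm_role_assignment.kv_admin`, `azurerm_key_vault.example`, and `var.secret_value` are hypothetical names for resources and variables assumed to be declared elsewhere in the module, and the 120-second duration is an arbitrary starting point rather than a documented propagation time.

```hcl
terraform {
  required_providers {
    # time_sleep is provided by the hashicorp/time provider.
    time = {
      source = "hashicorp/time"
    }
  }
}

# Wait after the role assignment is created, before anything that needs it.
resource "time_sleep" "wait_for_rbac" {
  depends_on      = [azurerm_role_assignment.kv_admin] # hypothetical name
  create_duration = "120s"                             # arbitrary; tune as needed
}

resource "azurerm_key_vault_secret" "example" {
  name         = "example-secret"
  value        = var.secret_value             # hypothetical variable
  key_vault_id = azurerm_key_vault.example.id # hypothetical name

  # The secret is created only after the sleep has completed.
  depends_on = [time_sleep.wait_for_rbac]
}
```

Because the delay is expressed as a resource in the dependency graph, it applies only when the role assignment is first created. Later runs that change nothing do not wait again.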