
We have multiple Terraform scripts that create and update hundreds of resources on Azure. Whenever we change anything on API Management related resources, the run takes ages and regularly even times out. Running it again sometimes resolves the problem, but sometimes it instead tells us that the API we want to create already exists, and similar errors.

The customer is getting really annoyed that we provide unreliable update scripts, which cause considerable effort for the operations team that is responsible for deploying and running the whole product. Saving changes in API Management also takes ages and runs into errors when we use the Azure portal.

Is there any trick or clue on how to improve our situation?

(This has been going on for a while now and feels like it is getting worse over time.)

2 Answers


  1. I’d start by using Terraform’s debugging options to find out precisely which resources are taking the longest. You can consider moving those out into a separate state, so they don’t have to be refreshed and planned on every run.

    Next, ensure that the process running Terraform has a timeout greater than Terraform’s own. Killing Terraform mid-run is a good way to end up with a confused state.

    Aside from that, some resources let you configure operation timeouts. Where they are available, you can use them to ensure Terraform treats an operation as failed before the process running Terraform kills it.

    I’d also consider opening a bug against the azurerm provider or asking in the Terraform section of the community forum.

  2. Azure API Management is slow to apply changes because it’s a distributed service. An update operation takes time because it waits until the changes have been applied to all instances. If you are facing similar issues in the portal, it’s a sign that the problem has nothing to do with Terraform or AzureRM. I would contact Azure support, as they will have the telemetry to help you further.

    In my personal experience, a guaranteed way to get things stuck is to make a lot of small changes in quick succession without waiting for the previous ones to finish, so I would start by checking that.

    Finally, if the previous steps don’t help, I would try using Bicep/ARM to manage the APIM instance. Usually, the ARM deployment API is a bit more robust than the APIs used by Terraform and the Go SDK.
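
To illustrate the operation timeouts suggested in the first answer, here is a minimal sketch of a `timeouts` block on an API Management API resource. All names and durations are hypothetical placeholders, not values from the question, and which operations can be configured varies by resource, so check the azurerm provider documentation for the resource in question.

```hcl
# Hypothetical example: every name and value below is a placeholder.
resource "azurerm_api_management_api" "example" {
  name                = "example-api"
  resource_group_name = "example-rg"
  api_management_name = "example-apim"
  revision            = "1"
  display_name        = "Example API"
  path                = "example"
  protocols           = ["https"]

  # Let Terraform give up on a slow operation by itself, so the run ends
  # with a recorded failure instead of being killed from outside mid-apply.
  # Keep these shorter than the timeout of the CI job or wrapper script.
  timeouts {
    create = "45m"
    update = "45m"
    delete = "45m"
  }
}
```

The idea is that the outer process (a CI job, for instance) should always outlive the longest timeout configured here. That way a slow API Management operation surfaces as a regular Terraform error rather than as an interrupted run.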
