In Azure I have a web portal that connects to a .NET 6 service, APIGateway/GetQuote, which is deployed to a Docker container, “GatewayContainer”. The “GetQuote” action calls another web service, “Quote”, with the action “DoQuote”. The “Quote” service runs in a container called “QuoteContainer”. There is also a Dapr sidecar alongside each main container.
The problem: when the web portal calls APIGateway GetQuote, it waits 60 seconds before returning to the browser. In that time the APIGateway calls Quote DoQuote up to 4 times, once every 15 seconds, because DoQuote sometimes takes longer than 15 seconds. Why is this happening?
- I have used Swagger to call the APIGateway, so there is no Front Door timeout at play.
- There is no HTTP timeout set when calling the services (so why 60 seconds and 15 seconds?).
- There is no timeout set in appsettings.json.
- The Dapr configuration does not have a timeout configured in Azure.
- There is only one revision running for each container.
- The timeout does not happen when I run the services locally in debug mode.
This only started happening on Monday 23rd Oct at 01:00:00 UTC. In the past the DoQuote service could take up to 22 seconds, but the APIGateway would wait for the Quote service to reply.
Any help greatly appreciated.
2 Answers
This was a bug in a Dapr release; it has now been fixed and the fix has been rolled out to Azure:
https://github.com/microsoft/azure-container-apps/issues/968
I’ve recently seen the exact same behavior. We have containers running Dapr, and when the underlying calls take longer than 15 seconds they are cancelled due to a timeout. After the timeout, the Dapr sidecar initiates one or more retries to the underlying service.
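To illustrate how a per-attempt timeout plus retries would produce the numbers in the question, here is a rough simulation. It is only a sketch under assumptions: the 15-second timeout and the limit of 4 attempts are taken from what was observed, not from any documented Dapr setting; no delay between attempts is assumed; and `simulate_invocation` is a hypothetical helper, not part of Dapr or any SDK.

```python
def simulate_invocation(call_duration_s, per_try_timeout_s=15.0, max_attempts=4):
    """Model a caller that cancels each attempt at the timeout and retries.

    Returns (attempts_made, total_wait_s, succeeded).
    Assumption: every attempt takes the same time and retries start immediately.
    """
    elapsed = 0.0
    for attempt in range(1, max_attempts + 1):
        if call_duration_s <= per_try_timeout_s:
            # The callee answers before the timeout fires.
            return attempt, elapsed + call_duration_s, True
        # The attempt is cancelled at the timeout, then retried.
        elapsed += per_try_timeout_s
    return max_attempts, elapsed, False


# A 22-second DoQuote never fits inside a 15-second window.
print(simulate_invocation(22.0))  # (4, 60.0, False)
# A 10-second DoQuote succeeds on the first attempt.
print(simulate_invocation(10.0))  # (1, 10.0, True)
```

Under these assumptions, a call that always takes 22 seconds is attempted 4 times and the caller waits 4 × 15 = 60 seconds in total, which matches the 60-second wait and the 15-second spacing seen in the question.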
I’m currently investigating this myself, and my early conclusions point toward Dapr Resiliency, which was released in version 1.10.
However, I cannot find any documentation that states a 15-second timeout.
https://docs.dapr.io/getting-started/quickstarts/resiliency/resiliency-serviceinvo-quickstart/
According to the Container Apps GitHub issues, resiliency is not yet released (https://github.com/microsoft/azure-container-apps/issues/585); however, in Application Insights and our Dapr logs we can see traces of resiliency taking place for service invocation calls.
I have not found any way to configure either timeouts or resiliency in Azure Container Apps at this point.