I have an Azure Function App inside an App Service Plan (Elastic Premium). The Function App is integrated with a VNet, and it is a durable function app with some functions triggered by non-HTTP triggers.
I noticed that the Service Plan is not able to scale out properly. The minimum instance count is 4 and the maximum is 100,
but checking the number of instances over the last 24 hours in:
"Function App -> Diagnose and solve problems -> Availability and performance -> HTTP Functions scaling"
the number of instances never goes higher than 5 or 6,
which is not good because if I check in:
"Function App -> Diagnose and solve problems -> Availability and performance -> Linux Load Average"
I have the following message:
"High Load Average Detected. Shows Load Average information if system has Load Average > 4 times # of Cpu’s in the timeframe. For more information see Additional Investigation notes below."
and also, checking the CPU and memory usage in the metrics, I can see some high spikes,
so this means that my App Service Plan is not able to scale out properly.
One of my colleagues suggested checking:
"Function App -> Configuration -> Function runtime settings -> Runtime Scale Monitoring"
because if it is set to "Off", it may be that the VNet blocks Azure from monitoring our app, and as a result Azure is not spawning more instances because it cannot see in real time what the CPU load is.
But I didn’t understand how this setting can help me scale out.
Do you know what "Runtime Scale Monitoring" is used for and why it can help me scale out?
And also, do you think this is more a problem related to scaling up rather than scaling out?
2 Answers
FUNCTIONS_WORKER_PROCESS_COUNT – this scales up the function worker processes, not the number of host instances running. If you set the value to X, each host instance runs X individual function worker processes concurrently.
Runtime Scale Monitoring – enabling this setting requires the Contributor role at both the App Service Plan and the Function App level. I believe this setting also needs the pre-warmed instance count to be set to at least 1. For more information on how the runtime scale controller works and on cost optimization when enabling these settings, see the Stack Overflow answers 1 and 2 provided by @HariKrishna and @HuryShen.
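If you prefer the CLI over the portal, something like the following should work (hedged: the resource names are placeholders, and the property paths follow the Microsoft docs on virtual network triggers):

```shell
# Placeholder names; substitute your own resource group and function app.
# Enable Runtime Scale Monitoring on the function app.
az resource update \
  --resource-group my-rg \
  --name my-func-app/config/web \
  --resource-type Microsoft.Web/sites \
  --set properties.functionsRuntimeScaleMonitoringEnabled=1

# Ensure at least one pre-warmed instance on the plan's apps.
az resource update \
  --resource-group my-rg \
  --name my-func-app/config/web \
  --resource-type Microsoft.Web/sites \
  --set properties.preWarmedInstanceCount=1
```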
Updated Answer:
In Azure Functions, an orchestrator function can be started by HTTP, Timer, or other event triggers, and it can invoke sub-orchestrations and activity functions.
To run multiple orchestrator functions in parallel, you can set extensions > durableTask > maxConcurrentOrchestratorFunctions to X in the host.json file. Refer to the JonasW blog article on how scaling happens in Azure Durable Functions.

I assume that you are not using HTTP triggers, but instead something like Service Bus. Also, I don’t think there is any "scaling up" in Consumption or Premium plans.
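For illustration, the durableTask section mentioned above could look like this in host.json (the concurrency values are placeholders, not recommendations):

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentOrchestratorFunctions": 10,
      "maxConcurrentActivityFunctions": 10
    }
  }
}
```

These limits are per instance, so the effective parallelism is the per-instance limit multiplied by the number of scaled-out instances.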
https://learn.microsoft.com/en-us/azure/azure-functions/functions-networking-options?tabs=azure-cli#premium-plan-with-virtual-network-triggers
My understanding is that the setting allows the scale controller to gain insight into what is triggering your function. So it does not scale horizontally by looking at CPU usage, but rather by looking at the executed triggers, to check whether the messages are being processed fast enough.
https://learn.microsoft.com/en-us/azure/azure-functions/event-driven-scaling#runtime-scaling
If you disable the setting, the Scale Controller shown in the image will not have access to the queue length and you will not observe any scaling.
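To see the kind of signal the scale controller acts on, you can inspect the queue length yourself; a hedged example for a Service Bus queue (all names are placeholders):

```shell
# Placeholder namespace/queue names; substitute your own.
# Prints the number of active (unprocessed) messages in the queue.
az servicebus queue show \
  --resource-group my-rg \
  --namespace-name my-namespace \
  --name my-queue \
  --query countDetails.activeMessageCount
```

If this count keeps growing while your instance count stays flat, that is consistent with the scale controller not seeing the trigger source.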