I am using HTTP-triggered Python Azure Functions to build an API that performs some operations on a ~100MB data file and returns a different response depending on the endpoint. For performance reasons the data file needs to be cached in memory (fetching it from Blob Storage for every request is too slow). There would be around 15-20 users using the API at a given time, each processing a different data file. The idea was that each of them would communicate with a single Functions instance, which would have their data cached in memory. To achieve this, two things need to be possible:
- the HTTP request (function trigger) would need to specify which instance it is targeting, perhaps through the instance's ID
- if it is the first request in a "session", meaning that the data file has not been cached yet, I need to be able to spawn a new Functions instance
I have looked through the MS documentation and could not find anything, so I assume it is not possible. However, I would also very much appreciate tips/guidance on alternative approaches to the problem.
2 Answers
AFAIK, a new function instance is started based on the number of requests (scaling) and the hosting plan. In the Consumption plan, for example, an instance stays idle for about 5 minutes, and if a request arrives within that window, the same instance can handle it, whether the event comes from the HTTP trigger or from another trigger.

In other hosting plans, one function host instance is always kept in a `warm-up` state and serves the first or next requests. Note, however, that the same function app host/instance can process multiple requests in parallel if the function code is asynchronous.

You can restrict the function app to a single host/instance with the `WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT` application setting in the Function App's Configuration menu, as described in MS Q&A #718906. Autoscaling helps when a heavy load hits your function app; there you can limit or increase the number of instances required.

There is no built-in way to do exactly what you describe. However, I would also discourage building such a system.
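A very similar result could be achieved by combining ARR Affinity with autoscale. ARR Affinity uses a cookie to route a client's subsequent requests to the same instance, so each user would tend to keep hitting the instance that already holds their data.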
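Whichever routing option is used, the caching itself could live in module-level state, which persists between invocations handled by the same Python worker process. The following is only a minimal sketch under stated assumptions: `get_data`, `MAX_FILES`, and the `loader` callback (which would wrap the Blob Storage download) are hypothetical names, not part of the Azure Functions API. A request that lands on an instance without the file simply reloads it, so correctness does not depend on affinity, only speed does.

```python
import threading
from collections import OrderedDict
from typing import Callable

# Hypothetical limit: how many ~100MB files one worker keeps in memory.
MAX_FILES = 4

# Module-level state is reused across invocations in the same worker process.
_cache: "OrderedDict[str, bytes]" = OrderedDict()
_lock = threading.Lock()


def get_data(file_id: str, loader: Callable[[str], bytes]) -> bytes:
    """Return the bytes for file_id, calling loader only on a cache miss.

    loader is an assumed callback that fetches the file, for example
    from Blob Storage. The lock is held during loading to keep the
    sketch simple, so concurrent misses are served one at a time.
    """
    with _lock:
        if file_id in _cache:
            # Mark as most recently used.
            _cache.move_to_end(file_id)
            return _cache[file_id]
        data = loader(file_id)
        _cache[file_id] = data
        if len(_cache) > MAX_FILES:
            # Evict the least recently used file to bound memory use.
            _cache.popitem(last=False)
        return data
```

Inside an HTTP-triggered function, the handler would call `get_data(file_id, loader)` with an ID taken from the request, so only the first request per file on a given instance pays the download cost.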