I am creating a data ingestion pipeline using Cloud Run. My Cloud Run API gets called via Pub/Sub every time a file is dropped in a GCS bucket. I need to load some metadata that contains text for the data I am ingesting. This metadata changes infrequently, so I do not want to reload it into memory on every execution. What is my best option? What I have been able to research so far is:
Option 1
You can also cache objects in memory if they are expensive to
recreate on each service request. Moving this from the request logic
to global scope results in better performance.
https://cloud.google.com/run/docs/tips#run_tips_global_scope-java
In the example given at this link, does the heavyComputation function only get called once at cold start? What if I need to retrigger this function occasionally when the metadata is updated? I also find the following information troubling, in that it seems to say there is no guarantee that other instances will reuse the object.
In Cloud Run, you cannot assume that service state is preserved
between requests. However, Cloud Run does reuse individual container
instances to serve ongoing traffic, so you can declare a variable in
global scope to allow its value to be reused in subsequent
invocations. Whether any individual request receives the benefit of
this reuse cannot be known ahead of time.
Option 2
Use something like Redis or Cloud Memorystore that is updated by a Cloud Function any time there are changes, with all instances of the Cloud Run API pulling metadata from Redis. Would this be less or more performant than Option 1? Are there any other downsides to this?
If there are other better ways of doing this, I would be very interested.
Update 1: I thought about it some more. Since my metadata is going to be different for each tenant, and each invocation of my Cloud Run code is going to ingest one file for one tenant, it would be a bad idea to load all tenants' metadata at each execution, even if it's cached. I might run separate Cloud Run services inside each tenant’s project though.
2 Answers
We start with your initial premise, which is “I obviously do not want to reload it in memory on every execution”. For me, that isn’t always true. If I implement a caching technology then, as a programmer, I have spent time getting it right and introduced opportunities for error and maintenance. If I have saved 100 ms per execution, how many thousands and thousands of executions would it take to break even on these costs versus the savings in reduced execution time? I commonly take the simplest approach up front, monitor operations afterwards, and address improvements only if warranted.
That all said, let’s assume that you have determined you will be making a bazillion new file creation requests per second and want to optimize. The key to understanding how best to use Cloud Run is that it runs a whole container that can process concurrent requests. I believe the default is 80. What that means is that if 80 concurrent files were created, only one container instance would be created and 80 parallel events would be processed. If you are coding in Java, this would mean 80 concurrent threads all within the same JVM. Each thread would have concurrent addressability to a common global variable. Only if the 81st request arrived and none of the previous 80 had already completed would a new Cloud Run container be spawned.
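The sharing described above can be sketched in plain Java. This is only an illustration under stated assumptions: `SharedMetadata`, `loadMetadata`, and `simulate` are hypothetical names, the thread pool merely stands in for concurrent requests arriving at one container instance, and the returned string is a placeholder for real metadata.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class SharedMetadata {
    // Counts how often the expensive load runs (for illustration only).
    static final AtomicInteger LOAD_COUNT = new AtomicInteger();

    // Initialization-on-demand holder: the JVM runs this initializer once,
    // in a thread-safe way, the first time Holder is referenced.
    private static final class Holder {
        static final String VALUE = loadMetadata();
    }

    // Hypothetical stand-in for reading metadata from slow storage.
    private static String loadMetadata() {
        LOAD_COUNT.incrementAndGet();
        return "metadata-v1";
    }

    static String get() {
        return Holder.VALUE;
    }

    // Simulates `requests` concurrent request threads in one JVM and
    // returns how many times the metadata was actually loaded.
    static int simulate(int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(requests);
        try {
            Callable<String> task = SharedMetadata::get;
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                results.add(pool.submit(task));
            }
            for (Future<String> f : results) {
                f.get(); // wait for every simulated request to finish
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return LOAD_COUNT.get();
    }
}
```

Calling `SharedMetadata.simulate(80)` runs 80 threads against the same static value while the load itself happens once per JVM, which mirrors one load per container instance rather than one per request.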
What this tells me is that my first “improvement” would be to populate cache data in my JVM on first usage and keep it present for subsequent reuse. When do you populate your cache data? This would be your design choice. You could populate it all up front when the container first starts … this would be sensible if you know that the data will be used for each and every request. Alternatively, if you have multiple cacheable values, consider creating a map that contains your name/value pairs and having an accessor which returns a cached value (if present) or retrieves it from slow storage, caches it, and then returns the value (if not originally present).
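A minimal sketch of such a map-backed accessor might look like the following. Everything here is an assumption made for illustration: the class name `MetadataCache`, the `loader` function (standing in for whatever reads from slow storage), the use of a tenant ID as the key, and the time-to-live, which is one possible way to pick up the infrequent metadata changes mentioned in the question.

```java
import java.time.Duration;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class MetadataCache {
    // A cached value plus the time it was loaded.
    private static final class Entry {
        final String value;
        final long loadedAtNanos;

        Entry(String value, long loadedAtNanos) {
            this.value = value;
            this.loadedAtNanos = loadedAtNanos;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // hypothetical slow-storage read
    private final long ttlNanos;

    MetadataCache(Function<String, String> loader, Duration ttl) {
        this.loader = loader;
        this.ttlNanos = ttl.toNanos();
    }

    // Returns the cached value if present and fresh; otherwise loads it,
    // caches it, and returns it. compute() is atomic per key, so concurrent
    // request threads asking for the same key trigger a single load.
    String get(String tenantId) {
        long now = System.nanoTime();
        Entry entry = entries.compute(tenantId, (key, old) -> {
            if (old != null && now - old.loadedAtNanos < ttlNanos) {
                return old;
            }
            return new Entry(loader.apply(key), now);
        });
        return entry.value;
    }

    // Drops one key so that the next get() reloads it.
    void invalidate(String tenantId) {
        entries.remove(tenantId);
    }
}
```

Holding one instance in a static field, for example `new MetadataCache(id -> readFromStorage(id), Duration.ofMinutes(10))` with a hypothetical `readFromStorage`, would give each container instance its own cache. Note that `invalidate` only affects the instance it is called on, so other running instances would still serve their copy until the time-to-live expires.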
Regarding the first option (Option 1):
The heavyComputation() function here would be called only at cold start, each time a new Cloud Run container instance is created (for example, when the maximum number of requests that can be sent in parallel to a given container instance is exceeded and therefore a new instance is created).
In order to address the second option (Option 2):
As of now, Cloud Run (fully managed) does not support Serverless VPC Access, and therefore a connection to Cloud Memorystore is not a possibility. Monitor the following Feature Request to get all the relevant information and updates from the Cloud Run product team to check when this feature will be available.
You can find some workarounds in this post and in the Feature Request already mentioned. They basically consist of: