I’ve wrapped Ray in a web API (using ray start --head and uvicorn with ray.init). Now I’m trying to:
- Submit a job to Ray (via the API), serialise the future, and return it to the user
- Later, let the user hit an API to check whether the task has finished and get the results
The kicker is that, because the server is multi-threaded, I can’t guarantee that the next call will come from the same thread. Here is what I naively thought would work:
id = my_function.remote()
id_hex = id.hex()
Then in another request/invocation:
id = ray._raylet.ObjectID(binascii.unhexlify(id_hex))
ray.get(id)
Now this never gets finished (it times out) even though I know the future is finished and that the code works if I run it in the same thread as the original request.
I’m guessing this has to do with using another initialisation of Ray.
Is there any way to force Ray to “refresh” a future’s result from Redis?
2 Answers
Serializing a Ray future to a string out-of-band and then deserializing it is not supported. The reason is that these futures are more than just IDs: they have a lot of state associated with them in various system components.
One thing you could do to support this type of API is have an actor that manages the lifetime of these tasks. When you start the task, you pass its ObjectID to the actor. Then, when a user hits the endpoint to check if it’s finished, it pings the actor which looks up the corresponding ObjectID and calls ray.wait() on it.
Reconstructing the ObjectID by hand like this can cause unexpected behavior because of Ray’s reference counting and optimization mechanisms. One recommendation is to use a “detached actor”: create a detached actor and delegate the call to it. Detached actors survive for as long as the Ray cluster runs (unless you kill them), so you don’t need to worry about the problems you mentioned. Yes, it can make the program a bit slower, since it requires 2 hops, but I’d guess this overhead won’t matter for your workload (client submission model).
https://docs.ray.io/en/latest/advanced.html?highlight=detached#detached-actors