Background Info
I have a Python application that uses LangChain and Ollama. Running this locally works fine because I have the Ollama server running on my machine.
What I want to do is host this application on a serverless platform (GCR, for example), and to do that I need to containerize it. This is easy for the Python side of the application, but I am struggling to get the Ollama part right.
I need the Ollama server to be running in order for LangChain to use it, and I cannot figure out how to get this right in the Dockerfile. I have tried multi-stage builds, the official Ollama Docker image, and installing from source, and all of these end up with the same issue: I get Ollama into the container, but if I then `RUN ollama serve`, nothing else can run because the build waits for the Ollama server to exit.
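For example, a build along these lines never gets past the serve step, because `RUN` waits for the command to exit:

```dockerfile
FROM ollama/ollama

# The build hangs here: ollama serve never exits
RUN ollama serve

# ...so this line is never reached
RUN ollama pull <model>
```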
I have also tried using `nohup` when starting the server, but whenever I try to pull a model with `ollama pull <model>` it comes back asking whether Ollama is running. I have added waits before the pull and it still doesn't work.
I have tried using a docker-compose.yml file, but the example I found did not do what I needed. I have tried multi-stage builds in the Dockerfile, building the Ollama server in the first stage and using it in the second, but this resulted in the same issues as building it in one stage.
I have tried startup scripts as entry points to get the container going, but they end up with the same errors: Ollama doesn't start up and I can't pull models.
Questions
- My first question: is it even possible to achieve what I am trying to do, or would it be better to host this type of application on a VM where I can install the required software?
- My second question: if it is possible, has anyone done something similar who can shed some light on it or offer advice?
Any advice or help with this would be really appreciated!
Answers
From an architectural perspective, I suggest installing and configuring Ollama as a standalone service on a VM or bare-metal server, where it can be managed with `systemctl status ollama` on Linux systems. The rationale behind this recommendation is that Ollama is meant to run as a long-lived service rather than as a build-time command, and your containerized application can simply connect to it over the network.
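With that setup, the application only needs to point LangChain at the remote Ollama instance. A minimal sketch; the VM address and model name are placeholders, and the Ollama service on the VM must listen on an address the app can reach (by default it binds to 127.0.0.1, which you can change with `OLLAMA_HOST=0.0.0.0`):

```python
from langchain_community.llms import Ollama

# Point LangChain at the Ollama service running on the VM
# (host address and model name below are placeholders)
llm = Ollama(
    base_url="http://10.0.0.5:11434",  # address of the VM running `ollama serve`
    model="llama3",
)

print(llm.invoke("Say hello"))
```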
If you still want everything in one container, the approach is to run Ollama as a service inside the container: start `ollama serve` in the background from a startup script, wait for it to come up, and only then pull the model and launch the application. Create a start_service.sh file:
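A minimal sketch of such a script; the model name and the Python entry point are placeholders for whatever your application actually uses:

```bash
#!/bin/bash
set -e

# Start the Ollama server in the background
ollama serve &

# Wait until the server answers before pulling anything
echo "Waiting for Ollama to start..."
until ollama list > /dev/null 2>&1; do
    sleep 1
done

# Pull the model the application needs (placeholder name)
ollama pull llama3

# Hand control over to the Python application (placeholder entry point)
exec python main.py
```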
In your Dockerfile:
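A corresponding Dockerfile sketch, assuming you build on the official `ollama/ollama` image (Ubuntu-based) and that start_service.sh sits alongside your application code:

```dockerfile
# Build on the official image so the ollama binary is already installed
FROM ollama/ollama

# Add Python on top of the Ubuntu base and create a virtual environment
RUN apt-get update && \
    apt-get install -y python3 python3-venv && \
    rm -rf /var/lib/apt/lists/*
RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

WORKDIR /app

# Install the Python dependencies (langchain etc.)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code together with the startup script
COPY . .
RUN chmod +x start_service.sh

# The base image's entrypoint is the ollama binary, so override it
ENTRYPOINT ["/app/start_service.sh"]
```

Keep in mind that with this setup the model is pulled every time a new container instance starts, so cold starts on a serverless platform will be slow and the instance needs enough memory and disk for the model.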
Hope this helps; this is how it worked for me.