I am using Chroma DB (0.4.8) in a Python 3.10 Flask REST API application. The application runs well on local developer machines (including Windows and OS X machines).
I am using the multi-stage Dockerfile below to package the application in an image based on python:3.10-slim (Debian 12 Bookworm). Images are built on GitHub Actions using the google-github-actions/deploy-cloudrun@v1 action:
FROM python:3.10-slim as base
ENV PYTHONFAULTHANDLER=1 \
    PYTHONHASHSEED=random \
    PYTHONUNBUFFERED=1
WORKDIR /app
# -------------------------------------
FROM base as builder
ENV PIP_DEFAULT_TIMEOUT=100 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    PIP_NO_CACHE_DIR=1 \
    POETRY_VERSION=1.6
RUN apt-get update --fix-missing && apt-get install -y --fix-missing build-essential
RUN pip install "poetry==$POETRY_VERSION"
COPY pyproject.toml ./
COPY chat_api ./chat_api
RUN poetry config virtualenvs.in-project true && \
    poetry install --only=main --no-root && \
    poetry build
# -------------------------------------
FROM base as final
COPY --from=builder /app/.venv ./.venv
COPY --from=builder /app/dist .
COPY docker-entrypoint.sh .
RUN ./.venv/bin/pip install *.whl
RUN ["chmod", "+x", "docker-entrypoint.sh"]
CMD ["./docker-entrypoint.sh"]
As I am using Poetry 1.6 to install the Python packages, here are the dependency specifications from my pyproject.toml file:
[tool.poetry.dependencies]
python = "^3.10"
flask = "^2.3.3"
langchain = "^0.0.279"
flask-api = "^3.1"
openai = "0.27.8"
chromadb = "0.4.8"
tiktoken = "^0.4.0"
flask-sqlalchemy = "^3.0.5"
sqlalchemy = "^2.0.20"
pymysql = "^1.1.0"
google-cloud-logging = "^3.6.0"
flask-httpauth = "^4.8.0"
flask-cors = "^4.0.0"
gunicorn = "^21.2.0"
flask-migrate = "^4.0.4"
cryptography = "^41.0.3"
When I run the image in Google Cloud Run or on a dev machine, the application loads successfully. However, as soon as a call is made to an endpoint that imports chromadb, the process crashes with this traceback:
[ERROR] Worker (pid:3) was sent SIGILL!
Uncaught signal: 4, pid=3, tid=3, fault_addr=3.
Extension modules: google._upb._message, grpc._cython.cygrpc, charset_normalizer.md, _cffi_backend, markupsafe._speedups, sqlalchemy.cyextension.collections, sqlalchemy.cyextension.immutabledict, sqlalchemy.cyextension.processors, sqlalchemy.cyextension.resultproxy, sqlalchemy.cyextension.util, greenlet._greenlet, yaml._yaml, pydantic.typing, pydantic.errors, pydantic.version, pydantic.utils, pydantic.class_validators, pydantic.config, pydantic.color, pydantic.datetime_parse, pydantic.validators, pydantic.networks, pydantic.types, pydantic.json, pydantic.error_wrappers, pydantic.fields, pydantic.parse, pydantic.schema, pydantic.main, pydantic.dataclasses, pydantic.annotated_types, pydantic.decorator, pydantic.env_settings, pydantic.tools, pydantic, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, multidict._multidict, yarl._quoting_c, aiohttp._helpers, aiohttp._http_writer, aiohttp._http_parser, aiohttp._websocket, frozenlist._frozenlist, numexpr.interpreter (total: 56)
File "/app/.venv/bin/gunicorn", line 8 in <module>
File "/app/.venv/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 67 in run
File "/app/.venv/lib/python3.10/site-packages/gunicorn/app/base.py", line 236 in run
File "/app/.venv/lib/python3.10/site-packages/gunicorn/app/base.py", line 72 in run
File "/app/.venv/lib/python3.10/site-packages/gunicorn/arbiter.py", line 202 in run
File "/app/.venv/lib/python3.10/site-packages/gunicorn/arbiter.py", line 571 in manage_workers
File "/app/.venv/lib/python3.10/site-packages/gunicorn/arbiter.py", line 642 in spawn_workers
File "/app/.venv/lib/python3.10/site-packages/gunicorn/arbiter.py", line 609 in spawn_worker
File "/app/.venv/lib/python3.10/site-packages/gunicorn/workers/base.py", line 142 in init_process
File "/app/.venv/lib/python3.10/site-packages/gunicorn/workers/sync.py", line 126 in run
File "/app/.venv/lib/python3.10/site-packages/gunicorn/workers/sync.py", line 70 in run_for_one
File "/app/.venv/lib/python3.10/site-packages/gunicorn/workers/sync.py", line 32 in accept
File "/app/.venv/lib/python3.10/site-packages/gunicorn/workers/sync.py", line 135 in handle
File "/app/.venv/lib/python3.10/site-packages/gunicorn/workers/sync.py", line 178 in handle_request
File "/app/.venv/lib/python3.10/site-packages/flask/app.py", line 2213 in __call__
File "/app/.venv/lib/python3.10/site-packages/flask/app.py", line 2190 in wsgi_app
File "/app/.venv/lib/python3.10/site-packages/flask/app.py", line 1484 in full_dispatch_request
File "/app/.venv/lib/python3.10/site-packages/flask/app.py", line 1469 in dispatch_request
File "/app/.venv/lib/python3.10/site-packages/flask_httpauth.py", line 174 in decorated
File "/app/.venv/lib/python3.10/site-packages/redacted/routes.py", line 39 in messages_post
File "/app/.venv/lib/python3.10/site-packages/redacted/logic.py", line 25 in __init__
File "/app/.venv/lib/python3.10/site-packages/redacted/logic.py", line 40 in _load_vector_store
File "/app/.venv/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 119 in __init__
File "/app/.venv/lib/python3.10/site-packages/chromadb/__init__.py", line 143 in Client
File "/app/.venv/lib/python3.10/site-packages/chromadb/config.py", line 247 in instance
File "/app/.venv/lib/python3.10/site-packages/chromadb/api/segment.py", line 82 in __init__
File "/app/.venv/lib/python3.10/site-packages/chromadb/config.py", line 188 in require
File "/app/.venv/lib/python3.10/site-packages/chromadb/config.py", line 244 in instance
File "/app/.venv/lib/python3.10/site-packages/chromadb/config.py", line 293 in get_class
File "/usr/local/lib/python3.10/importlib/__init__.py", line 126 in import_module
File "<frozen importlib._bootstrap>", line 1050 in _gcd_import
File "<frozen importlib._bootstrap>", line 1027 in _find_and_load
File "<frozen importlib._bootstrap>", line 1006 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688 in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883 in exec_module
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "/app/.venv/lib/python3.10/site-packages/chromadb/segment/impl/manager/local.py", line 13 in <module>
File "<frozen importlib._bootstrap>", line 1027 in _find_and_load
File "<frozen importlib._bootstrap>", line 1006 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688 in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883 in exec_module
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "/app/.venv/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_persistent_hnsw.py", line 9 in <module>
File "<frozen importlib._bootstrap>", line 1027 in _find_and_load
File "<frozen importlib._bootstrap>", line 1006 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688 in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883 in exec_module
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "/app/.venv/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_hnsw.py", line 21 in <module>
File "<frozen importlib._bootstrap>", line 1027 in _find_and_load
File "<frozen importlib._bootstrap>", line 1006 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 674 in _load_unlocked
File "<frozen importlib._bootstrap>", line 571 in module_from_spec
File "<frozen importlib._bootstrap_external>", line 1176 in create_module
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
Current thread 0x00003ef4e1198b80 (most recent call first):
Fatal Python error: Illegal instruction
The last line in the traceback that is coherent to me points to line 21 of chromadb/segment/impl/vector/local_hnsw.py, which contains only import hnswlib. I deduce that this is a failure in the installation of the chroma-hnswlib package.
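One way to confirm that the bare import is what triggers the crash is to attempt it in a child interpreter, so the fatal signal kills only the child. The following is a minimal sketch, not a definitive diagnostic; run_probe and describe are hypothetical helper names, and the signal decoding assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_probe(code: str) -> int:
    """Run `code` in a fresh interpreter and return its exit status."""
    # A child process keeps a fatal signal from taking down the caller.
    result = subprocess.run([sys.executable, "-c", code], capture_output=True)
    return result.returncode


def describe(returncode: int) -> str:
    """Translate a subprocess return code into a short diagnosis."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative value means the child was killed by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Try the import that the traceback points to.
    print(describe(run_probe("import hnswlib")))
```

Run inside the container with the virtual environment's interpreter, a result of "killed by SIGILL" would point at the compiled extension itself rather than at the application code.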
In the image's virtual environment, in the .venv/lib/python3.10/site-packages folder, I see the package as the folder chroma_hnswlib-0.7.2.dist-info and an adjacent file called hnswlib.cpython-310-x86_64-linux-gnu.so.
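To check which file the interpreter would actually load for hnswlib without triggering the crash, the module can be located without being imported. This is a small sketch under the assumption that it is run with the virtual environment's interpreter; locate_module is a hypothetical helper name.

```python
import importlib.util
from typing import Optional


def locate_module(name: str) -> Optional[str]:
    """Return the file a top-level module would load from, or None if absent."""
    # For a top-level name, find_spec only searches the import path;
    # it does not execute the module, so a crashing import is not triggered.
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print(locate_module("hnswlib"))
```

If this prints the path of the .so file listed above, the extension is present and discoverable, which suggests the problem lies in how it was compiled rather than in where it was installed.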
My question is: why is my image failing to correctly install chroma-hnswlib, and how can I fix this?
UPDATE: I have modified my Dockerfile so that it now uses a single stage. This means the build-essential packages are now present in the resulting image. When I run the new image on my Windows machine (AMD Ryzen 7), the crash no longer occurs. When I run the image in Google Cloud Run, the crash still occurs.
UPDATE 2: Up until now, the images I've used were built in GitHub Actions. As an experiment, I built an image on my dev machine and deployed it directly to Cloud Run, and it works. I'm now investigating which type of CPU GitHub Actions runs the build on.
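For comparing the build machine with the runtime, a script along these lines could be run in both places and the outputs compared. It is only a sketch: it assumes a Linux system that exposes /proc/cpuinfo, and the list of flags is an illustrative selection of instruction-set extensions, not an exhaustive one. The names parse_cpu_flags and report are hypothetical.

```python
from pathlib import Path
from typing import Dict, Set

# Instruction-set extensions to report (an illustrative selection).
FLAGS_OF_INTEREST = ("sse4_2", "avx", "avx2", "fma", "avx512f")


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Extract the CPU feature flags from the text of /proc/cpuinfo."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 kernels label the line "flags"; ARM kernels use "Features".
        if key.strip() in ("flags", "Features"):
            return set(value.split())
    return set()


def report(flags: Set[str]) -> Dict[str, bool]:
    """Map each flag of interest to whether the CPU advertises it."""
    return {name: name in flags for name in FLAGS_OF_INTEREST}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(report(parse_cpu_flags(cpuinfo.read_text())))
```

A flag that is True on the build machine but False at runtime would be consistent with an illegal instruction error in code compiled for the build machine's CPU.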
2 Answers
You wrote:
Prefer:
The trouble was that dynamically linked *.so libraries were being installed in a location that you neglected to copy over, leading to lossage.
Symlink .venv to /app if need be. Or COPY it. Or arrange for the relevant /app directory to appear in PYTHONPATH.

I was able to get past my illegal instruction errors with Chroma by setting the environment variable HNSWLIB_NO_NATIVE=1 before running pip install chromadb. Looking at the source code, this removes use of the -march=native compiler flag. As @CharlesDuffy indicates in a comment above, illegal instruction indicates a difference in CPU features between where you built it and where you're running it. So this workaround, if not ideal, at least makes sense.
In my case, I'm building on AWS CodeBuild and running in Lambda. One idea to (maybe?) ensure the CPUs are the same is the new support for using Lambda runtimes in CodeBuild, but you can't use docker commands there (a further suggestion is to use podman instead; maybe I'll try it sometime, but for now 🤷‍♂️, things are working and it all seems fine).
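The workaround can also be scripted. The sketch below sets HNSWLIB_NO_NATIVE=1 only for the pip process, leaving the caller's environment untouched. The function name install_without_native and its dry_run parameter are hypothetical; only the environment variable and the pip install chromadb command come from the answer above. In a Dockerfile, the equivalent would presumably be an ENV line placed before the install step.

```python
import os
import subprocess
import sys
from typing import Dict, List, Tuple


def install_without_native(
    package: str = "chromadb", dry_run: bool = True
) -> Tuple[List[str], Dict[str, str]]:
    """Prepare (and optionally run) a pip install with HNSWLIB_NO_NATIVE set."""
    # Copy the current environment so the change affects only the child process.
    env = dict(os.environ)
    # Per the answer above, this makes the hnswlib build drop -march=native.
    env["HNSWLIB_NO_NATIVE"] = "1"
    command = [sys.executable, "-m", "pip", "install", package]
    if not dry_run:
        # check=True raises CalledProcessError if pip fails.
        subprocess.run(command, env=env, check=True)
    return command, env
```

Calling install_without_native(dry_run=False) performs the install; the default only returns the command and environment so they can be inspected first.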