
I am very new to Docker but enjoying learning it. Recently I tried to build an image for a simple ML app that I had built earlier. I used the python:3.11-slim base image and installed a few dependencies. After the final build, the image size turned out to be 1.13 GB. How is this happening?

Following is my Dockerfile:

FROM python:3.11-slim

EXPOSE 8080

ADD requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

WORKDIR /app

COPY . .

ENTRYPOINT ["streamlit", "run", "app.py", "--server.port=8080"]

=================================================================

Following is my requirements.txt:

joblib==1.4.2
numpy==1.26.4
scikit-learn==1.5.0
scipy==1.13.1
streamlit==1.35.0
xgboost==2.1.0

2 Answers


  1. In general, the dependencies you install pull in further dependencies of their own.

    In your example, XGBoost and scikit-learn are heavy packages that most likely pulled in additional binary files during installation.

    I ran your Dockerfile and at first glance noticed this layer of the build:

    [Image: image layer of docker build]

    It appears to be the main source of your large image.

    Looking a bit closer, these are the top 3 largest pip dependencies after building the image:

    240M    /usr/local/lib/python3.11/site-packages/xgboost
    129M    /usr/local/lib/python3.11/site-packages/pyarrow
    79M     /usr/local/lib/python3.11/site-packages/pandas
    [...]
    

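    Some of them are sub-dependencies of the packages in your requirements.txt: pyarrow and pandas, for example, are not listed there but are installed as dependencies of streamlit.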
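
    To reproduce this kind of breakdown yourself, `docker history <image>` lists the size each Dockerfile instruction added, and a `du` ranking inside the container shows which packages account for it. Below is a minimal sketch of the ranking, assuming you paste it into a shell inside a container started from your image. `largest_entries` is a hypothetical helper name, not a Docker or pip command.

    ```shell
    # Print the N largest entries directly under a directory,
    # as "size_in_KB<TAB>path", largest first. N defaults to 10.
    largest_entries() {
        dir="$1"
        count="${2:-10}"
        # du -sk: one summarized total per entry, in kilobytes
        # sort -rn: numeric sort, largest first
        du -sk "$dir"/* | LC_ALL=C sort -rn | head -n "$count"
    }
    ```

    For example, `largest_entries /usr/local/lib/python3.11/site-packages 3` prints the three largest packages, with sizes in KB rather than the human-readable units shown above.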

  2. You can check all the dependencies and their sizes by saving the build output to a separate file:

    docker build . --no-cache > build_output.txt 2>&1
    

    Here we can see that the three wheels below alone add up to roughly 390 MB, with nvidia_nccl_cu12 and xgboost accounting for about 350 MB of that.

    #7 44.94 Downloading nvidia_nccl_cu12-2.23.4-py3-none-manylinux2014_x86_64.whl (199.0 MB)
    #7 62.85    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 199.0/199.0 MB 11.3 MB/s eta 0:00:00
    #7 19.60 Downloading scipy-1.13.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (38.6 MB)
    #7 23.01    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 38.6/38.6 MB 11.4 MB/s eta 0:00:00
    #7 23.85 Downloading xgboost-2.1.0-py3-none-manylinux_2_28_x86_64.whl (153.9 MB)
    #7 37.49    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 153.9/153.9 MB 11.6 MB/s eta 0:00:00
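
    To pull just the wheel sizes out of that log, a small filter is enough. This is a sketch, assuming the log lines look like the ones above. `wheel_sizes` is a hypothetical helper name, and wheels reported in kB are deliberately skipped because they barely affect the total.

    ```shell
    # List "size_in_MB wheel_name" for every "Downloading <wheel> (<N> MB)"
    # line in a saved build log, largest first.
    wheel_sizes() {
        log_file="$1"
        grep 'Downloading' "$log_file" \
            | sed -n 's/.*Downloading \([^ ]*\) (\([0-9.]*\) MB).*/\2 \1/p' \
            | LC_ALL=C sort -rn
    }
    ```

    Running `wheel_sizes build_output.txt` on the three lines above lists nvidia_nccl_cu12 (199.0) first, then xgboost (153.9), then scipy (38.6). These are download sizes of the compressed wheels, so the installed size on disk is typically larger.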
    