
Packages installed by Poetry significantly increase the image size when the image is built for amd64.

I’m building a Docker image on my host machine (macOS, M2 Pro) that I want to deploy to an EC2 instance. A normal build produces a 2GB image, which is fine, but it causes a platform compatibility issue when deployed on EC2: WARNING: The requested image's platform (linux/arm64/v8) does not match the detected host platform (linux/amd64/v3) and no specific platform was requested. So I tried a build with the buildx command instead. However, that results in a whopping 13GB image, even though all I changed was the build command. I’d like to know why, and how to reduce the size.

Here is the Dockerfile:

FROM python:3.11-slim

# for -slim version (it breaks if you don't comment out && apt-get clean)
RUN apt-get update && apt-get install -y \
    gfortran \
    libopenblas-dev \
    liblapack-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Set environment variables to make Python and Poetry play nice
ENV POETRY_VERSION=1.7.1 \
    PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    EXPERIMENT_ID=$EXPERIMENT_ID \
    RUN_ID=$RUN_ID


## Install poetry
RUN pip install "poetry==$POETRY_VERSION"

## copy project requirement files here to ensure they will be cached.
WORKDIR /app

COPY pyproject.toml ./

RUN poetry config virtualenvs.create false \
    && poetry install --no-interaction --no-dev --no-ansi --verbose \
    && poetry cache clear pypi --all

With this pyproject.toml, you can reproduce the build.

[tool.poetry]
name = "malicious-url"
version = "0.1.0"
description = ""
authors = ["Makoto1021 <[email protected]>"]
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.11"
numpy = "^1.26.4"
tld = "^0.13"
fuzzywuzzy = "^0.18.0"
scikit-learn = "^1.4.1.post1"
pandas = "^2.2.1"
mlflow = {extras = ["pipelines"], version = "^2.11.3"}
xgboost = "^2.0.3"
python-dotenv = "^1.0.1"
imblearn = "^0.0"
torch = "^2.2.2"
flask = "^3.0.3"
googlesearch-python = "^1.2.3"
whois = "^1.20240129.2"
nltk = "^3.8.1"

[tool.poetry.group.dev.dependencies]
ipykernel = "^6.29.3"
tldextract = "^5.1.2"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

And this command will build a 2GB image:

docker build -f ./docker/Dockerfile \
    -t malicious-url-prediction-img:v1 .

And this one will make a 13GB image:

docker buildx build --platform linux/amd64 -f ./docker/Dockerfile \
    -t malicious-url-prediction-img:v1-amd64 .

The image size stays small if I remove the RUN poetry config virtualenvs.create... instruction, even when I build for amd64, so I assume Poetry is causing the problem. Still, it’s strange to see such a big difference in size from just changing the target platform.

I went inside the huge image with /bin/bash and ran du, and found that the total used space is only around 2MB. This suggests that the large image size is not directly related to the files I’m adding, but rather to the base image and the layers created by package installations and configuration.
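For reference, docker history prints the size of each layer, so comparing the two tags shows which instruction the extra space comes from:

# per-layer sizes of the small native build vs. the big amd64 build
docker history malicious-url-prediction-img:v1
docker history malicious-url-prediction-img:v1-amd64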

I also used dive to investigate the space usage, but it doesn’t really tell me what’s wrong.

I have a feeling that it’s coming from Poetry. Any advice?

FYI, this is how I run the container.

docker run --rm -p 7070:5000 -v $(pwd)/logs:/app/logs malicious-url-prediction-img:v1-amd64

EDITED:

  • changed the Dockerfile to a minimal example
  • added pyproject.toml to reproduce the build
  • added my investigation into Poetry

3 Answers


  1. The problem is builds.

    When you run AMD64, chances are there are already precompiled wheels you can use, so there’s no need to build anything. When you’re using ARM64 there are often far fewer wheels available, requiring you to build the packages yourself.

    Builds take up space: they download libraries, compile artifacts, and link things together.
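    You can check whether a prebuilt wheel exists for a platform without building anything. As a rough illustration (using pip directly rather than Poetry, with xgboost as one example package from your list):

    # succeeds if a prebuilt x86_64 wheel exists; errors out if pip would have to build from source
    pip download xgboost --only-binary=:all: --no-deps \
        --platform manylinux2014_x86_64 --python-version 3.11 -d /tmp/wheels

    # the same check against the arm64 wheel tag
    pip download xgboost --only-binary=:all: --no-deps \
        --platform manylinux2014_aarch64 --python-version 3.11 -d /tmp/wheels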

    What you should do instead is use a multi-stage build: run the installation in one stage of the build process, and then copy the resulting files into a fresh image.

    For example (this isn’t tested, but is based on your file and should work with some tweaks):

    FROM python:3.11-slim AS builder
    
    # for -slim version (it breaks if you don't comment out && apt-get clean)
    RUN apt-get update && apt-get install -y \
        gfortran \
        libopenblas-dev \
        liblapack-dev \
        && apt-get clean \
        && rm -rf /var/lib/apt/lists/*
    
    # Set environment variables to make Python and Poetry play nice
    ENV POETRY_VERSION=1.7.1 \
        PYTHONUNBUFFERED=1 \
        PYTHONDONTWRITEBYTECODE=1 \
        EXPERIMENT_ID=$EXPERIMENT_ID \
        RUN_ID=$RUN_ID
    
    ## Install poetry
    RUN pip install "poetry==$POETRY_VERSION"
    
    ## copy project requirement files here to ensure they will be cached.
    WORKDIR /app
    
    COPY poetry.lock pyproject.toml ./
    
    RUN poetry config virtualenvs.create false \
        && poetry install --no-interaction --no-dev --no-ansi --verbose \
        && poetry cache clear pypi --all
    
    
    # Build our actual container now.
    FROM python:3.11-slim
    
    ENV POETRY_VERSION=1.7.1 \
        PYTHONUNBUFFERED=1 \
        PYTHONDONTWRITEBYTECODE=1 \
        EXPERIMENT_ID=$EXPERIMENT_ID \
        RUN_ID=$RUN_ID
    
    WORKDIR /app
    
    # Copy the packages installed in the builder stage, plus the
    # console-script entry points (poetry, flask, ...) from /usr/local/bin.
    COPY --from=builder /usr/local/lib/python3.11 /usr/local/lib/python3.11
    COPY --from=builder /usr/local/bin /usr/local/bin
    
    # Copy the model directory to the Docker image
    COPY mlartifacts/542535033067401306/2ba405d6d3144db6a5cd237509e53ef8 /app/mlartifacts/542535033067401306/2ba405d6d3144db6a5cd237509e53ef8/
    
    # Copy the blacklist and whitelist data
    COPY data/blacklist.txt data/whitelist.txt /app/data/
    
    ## Copy Python files
    COPY *.py ./
    COPY utils/ ./utils/
    COPY .env .
    
    EXPOSE 7070
    
    CMD ["poetry", "run", "flask", "run", "--host=0.0.0.0"]
    

    This method is used for all of the Multi-Py projects (disclaimer: I’m the author of these), so you can look there for examples.

  2. Some ISAs may produce smaller or larger builds, but the amount of increase in your case isn’t normal. Here I propose a small refactoring of your Dockerfile.

    I use this trick all the time, but I haven’t tested it here.

    You need gfortran and some other packages to build your Python packages. After the build they are no longer necessary, and you can uninstall them:

    RUN apt update && apt install -y gfortran liblapack3 liblapack-dev libopenblas0 libopenblas-dev
    RUN <build your packages>
    RUN apt purge -y gfortran liblapack-dev libopenblas-dev
    RUN apt autoremove -y --purge
    RUN <other cleanups>
    

    But the code above does not reduce your image size, since the build dependencies still exist in the earlier layers. You must collapse it into a single RUN instruction:

    RUN apt update && apt install -y gfortran liblapack3 liblapack-dev libopenblas0 libopenblas-dev ; \
        <build your packages> ; \
        apt purge -y gfortran liblapack-dev libopenblas-dev ; \
        apt autoremove -y --purge
    

    This will decrease your image size considerably.

    apt, pip, and similar tools download files into cache directories (although apt deletes downloaded .deb files by default). You can add --mount=type=cache to your RUN instructions to keep these caches in separate cache volumes instead of in your image layers.

    So here is your refactored Dockerfile (I added the liblapack3 and libopenblas0 runtime packages explicitly so they remain after apt autoremove):

    FROM python:3.11-slim
    
    ENV POETRY_VERSION=1.7.1 \
        PYTHONUNBUFFERED=1 \
        PYTHONDONTWRITEBYTECODE=1 \
        EXPERIMENT_ID=$EXPERIMENT_ID \
        RUN_ID=$RUN_ID
    
    WORKDIR /app
    COPY poetry.lock pyproject.toml ./
    
    RUN --mount=type=cache,target=/var/cache/apt \
        --mount=type=cache,target=/root/.cache/pip \
        --mount=type=cache,target=/root/.cache/pypoetry/cache \
        set -eux ; \
        apt update && apt install -y gfortran libopenblas0 libopenblas-dev liblapack3 liblapack-dev ; \
        pip install "poetry==$POETRY_VERSION" ; \
        poetry config virtualenvs.create false ; \
        poetry install --no-interaction --no-dev --no-ansi --verbose ; \
        poetry cache clear pypi --all ; \
        apt purge -y gfortran libopenblas-dev liblapack-dev ; \
        apt autoremove -y --purge ; \
        rm -rf /var/lib/apt/lists/*
    
    COPY mlartifacts/542535033067401306/2ba405d6d3144db6a5cd237509e53ef8 /app/mlartifacts/542535033067401306/2ba405d6d3144db6a5cd237509e53ef8/
    COPY data/blacklist.txt data/whitelist.txt /app/data/
    COPY *.py ./
    COPY utils/ ./utils/
    COPY .env .
    
    EXPOSE 7070
    
    CMD ["poetry", "run", "flask", "run", "--host=0.0.0.0"]
    
  3. the difference between the arm64 and amd64 image size comes down to a conditional architecture-dependent dependency

    torch on amd64 pulls in a series of nvidia* dependencies to support gpu development — but does not pull those in on other architectures: https://github.com/pytorch/pytorch/blob/7a6edb0b6644eb2b28650ea3be1c806e4a57e351/.github/workflows/generated-linux-binary-manywheel-main.yml#L51

    nvidia-cuda-nvrtc-cu11==11.8.89; platform_system == 'Linux' and platform_machine == 'x86_64' ...
    

    you can see this reflected in the wheel metadata from pypi:

    $ curl --silent https://files.pythonhosted.org/packages/c3/33/d7a6123231bd4d04c7005dde8507235772f3bc4622a25f3a88c016415d49/torch-2.2.2-cp311-cp311-manylinux1_x86_64.whl.metadata | grep platform_system
    Requires-Dist: nvidia-cuda-nvrtc-cu12 (==12.1.105) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cuda-runtime-cu12 (==12.1.105) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cuda-cupti-cu12 (==12.1.105) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cudnn-cu12 (==8.9.2.26) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cublas-cu12 (==12.1.3.1) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cufft-cu12 (==11.0.2.54) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-curand-cu12 (==10.3.2.106) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cusolver-cu12 (==11.4.5.107) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cusparse-cu12 (==12.1.0.106) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-nccl-cu12 (==2.19.3) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-nvtx-cu12 (==12.1.105) ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: triton (==2.2.0) ; platform_system == "Linux" and platform_machine == "x86_64" and python_version < "3.12"
    

    and in the aarch64 wheel as well:

    $ curl --silent https://files.pythonhosted.org/packages/cd/fd/2121f53629c433589273a2e8f71d29705e98024e0abe2360e63b852e80bb/torch-2.2.1-cp311-cp311-manylinux2014_aarch64.whl.metadata | grep platform_system
    Requires-Dist: nvidia-cuda-nvrtc-cu12 ==12.1.105 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cuda-runtime-cu12 ==12.1.105 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cuda-cupti-cu12 ==12.1.105 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cudnn-cu12 ==8.9.2.26 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cublas-cu12 ==12.1.3.1 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cufft-cu12 ==11.0.2.54 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-curand-cu12 ==10.3.2.106 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cusolver-cu12 ==11.4.5.107 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-cusparse-cu12 ==12.1.0.106 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-nccl-cu12 ==2.19.3 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: nvidia-nvtx-cu12 ==12.1.105 ; platform_system == "Linux" and platform_machine == "x86_64"
    Requires-Dist: triton ==2.2.0 ; platform_system == "Linux" and platform_machine == "x86_64" and python_version < "3.12"
    

    these dependencies are not pulled in on arm64 (aarch64):

    # du -hs /usr/local/lib/python3.11/site-packages/{triton,nvidia}
    420M    /usr/local/lib/python3.11/site-packages/triton
    2.8G    /usr/local/lib/python3.11/site-packages/nvidia
    

    (there may be more as well; I only traced torch, whose dependency problem I’m already familiar with, since I couldn’t build your docker image without running out of disk space!)
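    as a possible follow-up (a sketch I haven’t tested against this project): if the EC2 target has no gpu, you can tell poetry to pull torch from pytorch’s CPU-only wheel index, which skips the nvidia/triton payload entirely. roughly, in pyproject.toml:

    # untested sketch: the source name "pytorch-cpu" is arbitrary
    [[tool.poetry.source]]
    name = "pytorch-cpu"
    url = "https://download.pytorch.org/whl/cpu"
    priority = "explicit"

    [tool.poetry.dependencies]
    torch = { version = "^2.2.2", source = "pytorch-cpu" }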
