In the same project, the Docker image I built on my MacBook Pro (13-inch, M1, 2020) is 1.56GB, while the image built on GitHub Actions is 11.7GB. Why is there such a difference, even though I used the same Dockerfile?
My code is very simple; it just depends on many third-party Python packages. What should I do to fix this?
Dockerfile:
FROM python:3.11-slim AS builder
WORKDIR /app
RUN pip install poetry
COPY pyproject.toml poetry.lock /app/
RUN poetry config virtualenvs.create false \
    && poetry install --no-dev --no-interaction --no-ansi
COPY . /app
CMD ["python3", "main.py"]
Workflow:
steps:
  - name: Checkout
    uses: actions/checkout@v4
  - name: Log in to GitHub Container Registry
    run: echo "${{ secrets.MY_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
  - name: Build Docker image
    run: docker build -t ghcr.io/${{ github.repository_owner }}/${{ github.repository }}/${{env.IMAGE_NAME}}:latest .
  - name: show image size
    run: docker images
  - name: Push Docker image to GHCR
    run: docker push ghcr.io/${{ github.repository_owner }}/${{ github.repository }}/${{env.IMAGE_NAME}}:latest
[Image: Local Build]
[Image: GitHub Actions Build]
I want the Docker image built on GitHub Actions to be the same size as the image built locally.
3 Answers
I wrote a Python tool and found the cause: the third-party Python package "sentence-transformers" automatically pulls in NVIDIA-related packages when it is installed in Docker, which makes the image very large. If you run into a similar problem, check whether the third-party packages you install are downloading dependencies you are not aware of.
I cannot answer this directly, but I do know a very useful tool to figure this out.
https://github.com/wagoodman/dive
Dive lets you inspect each layer of your Docker image. That should show you where the difference between the two images comes from and, hopefully, how to avoid it.
If I had to guess, it’s probably this command here:
COPY . /app
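One rough way to test that guess is to measure what COPY . /app would pick up from the build context in each environment. The sketch below totals the size of every top-level entry in a directory. It is only an approximation: it ignores any .dockerignore rules, skips symlinks, and the function names are hypothetical.

```python
import os
from pathlib import Path


def tree_size(path):
    """Return the total size in bytes of regular files under path (symlinks skipped)."""
    path = Path(path)
    if path.is_symlink():
        return 0
    if path.is_file():
        return path.stat().st_size
    total = 0
    # os.walk does not descend into symlinked subdirectories by default.
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            file_path = Path(dirpath) / filename
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def top_level_sizes(root="."):
    """Return (name, bytes) pairs for each entry directly under root, biggest first."""
    entries = [(entry.name, tree_size(entry)) for entry in Path(root).iterdir()]
    return sorted(entries, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Report what the build context in the current directory contains.
    for name, size in top_level_sizes("."):
        print(f"{size / 1_000_000:10.1f} MB  {name}")
```

If the totals are small in both environments, the COPY layer is unlikely to explain a difference of several gigabytes.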
Most likely this is caused by the different underlying architecture and by caching. I would try building the image for the same architecture in GitHub Actions using the --platform option.
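To confirm that the two builds really target different architectures, a small check like the sketch below can be run inside each image. It only reads values from the standard-library platform module. The function name and the dictionary keys are illustrative assumptions.

```python
import platform


def describe_environment():
    """Return basic facts about the machine this interpreter is running on."""
    return {
        "machine": platform.machine(),        # CPU architecture as reported by the OS
        "system": platform.system(),          # operating system name
        "python": platform.python_version(),  # interpreter version
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the "machine" value differs between the local image and the GitHub Actions image, package managers may be selecting different platform-specific wheels for the same lock file.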
Also, try using multi-stage builds.