
At work there is a Docker host with a fairly small /var/lib/docker that fills up quickly whenever a few docker build commands fail in a row, in particular because not all of the docker build commands use the flags --no-cache --force-rm --rm=true, the point of which (in my understanding) is to delete leftover junk after successful or unsuccessful builds. You can find these flags at https://docs.docker.com/engine/reference/commandline/build/ if you scroll down.

One issue we are having is that not everybody runs docker build with the flags --no-cache --force-rm --rm=true, and that is kind of hard to track down (silly, I know), but there may also be other causes of /var/lib/docker filling up that we have not caught. IT would not give us permission to look inside that directory for a better understanding, but we are able to run docker image prune or docker system prune, and that seems to solve our problem, except that for now we run it manually, whenever things go bad.

We are thinking of getting ahead of the problem by a) running yes | docker image prune just about every time after an image is built. I wrote "just about" because it is hard to track down every repo that builds an image (successfully or not), but that is a separate story. Even if this command has some side effect (such as breaking somebody else's simultaneous docker build on the same Docker host), it would only run once in a while, so the probability of a clash would be low. The other approach being discussed is b) blindly adding yes | docker image prune to a cron job that runs, say, every 2 hours; a sketch of that is below. If this command has potential negative side effects, the damage would then be more likely.
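
For concreteness, here is roughly what the cron variant would look like (the exact schedule, the log path, and the -f flag in place of the yes pipe are placeholders, not a deployed setup):

    # /etc/cron.d/docker-prune (sketch): remove dangling images every 2 hours.
    # docker image prune -f skips the confirmation prompt, so no "yes |" pipe is needed.
    0 */2 * * * root /usr/bin/docker image prune -f >> /var/log/docker-prune.log 2>&1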

Why do I even think that another docker build might break? Well, I do not know it for a fact, or else I would not be asking this question. In an attempt to better understand the so-called <none>:<none> images that we sometimes end up with after a broken docker build, I read this often-cited article: https://projectatomic.io/blog/2015/07/what-are-docker-none-none-images/

My understanding is that a docker build that has not finished yet leaves some intermediate images on disk, which it can then clean up at the end, depending on the flags. However, if something (such as a yes | docker image prune issued in parallel) deletes some of these intermediate image layers, then the overall build would also fail.
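
To make the worry concrete, this is the kind of check I have in mind for a scratch host (the probe tag is just a placeholder):

    # Does an in-flight build leave layers that prune would consider dangling?
    docker build --no-cache -t probe . &
    watch -n 1 'docker images --filter "dangling=true"'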

Is this true? If so, what is a good way to keep /var/lib/docker clean when building many images?

P.S. I am not a frequent user of S.O. so please suggest ways of improving this question if it violates some rules.

3 Answers


  1. A docker image prune (without the -a option) will remove only dangling images, not unused images.

    As explained in "What is a dangling image and what is an unused image?"

Dangling images are images which do not have a tag and do not have a child image (e.g. an old image that used a different version of FROM busybox:latest) pointing to them.

    They may have had a tag pointing to them before and that tag later changed.
    Or they may have never had a tag (e.g. the output of a docker build without including the tag option).

Intermediate images produced by a docker build should not be considered dangling, as they have a child image pointing to them.

    As such (to be tested), it should be safe to use yes | docker image prune while images are being built.
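
    To see beforehand what a prune would target, you can list the dangling images yourself (standard commands, shown as a sanity check rather than a guaranteed-safe procedure):

    # List dangling (untagged, childless) images, exactly what docker image prune removes
    docker images --filter "dangling=true"
    # Remove them non-interactively; -f replaces the "yes |" pipe
    docker image prune -f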

Plus, BuildKit is now the default builder on Linux (since moby v23.0.0), and it is designed to avoid side effects with the rest of the API (intermediate images and containers):

At the core of BuildKit is a Low-Level Build (LLB) definition format. LLB is an intermediate binary format that allows developers to extend BuildKit.
    LLB defines a content-addressable dependency graph that can be used to put together very complex build definitions.
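
    On daemons older than v23.0.0, you can opt in per build; these are standard switches, suggested here rather than verified on the asker's host:

    # Opt in to BuildKit for a single build on an older daemon
    DOCKER_BUILDKIT=1 docker build -t myimage .
    # BuildKit keeps a separate build cache; prune it on its own if disk is tight
    docker builder prune -f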

  2. I tried to reproduce the described behavior with the following script. The idea is to start several docker build processes in parallel and, while they run, also start several docker system prune processes in parallel.

    Dockerfile:

    FROM centos:7
    RUN echo "before sleep"
    RUN sleep 10
    RUN echo "after sleep"
    RUN touch /myfile
    

    test.sh:

    #!/bin/bash
    # Start four independent builds in parallel; --no-cache forces every layer to be rebuilt
    docker build -t test1 --no-cache . &
    docker build -t test2 --no-cache . &
    docker build -t test3 --no-cache . &
    docker build -t test4 --no-cache . &
    sleep 5
    echo Prune!
    # Fire two prunes while the builds are still inside "RUN sleep 10"
    docker system prune -f &
    docker system prune -f &
    # Give the builds time to finish, then verify each image was built intact
    sleep 15
    docker run --rm test1 ls -la /myfile
    docker run --rm test2 ls -la /myfile
    docker run --rm test3 ls -la /myfile
    docker run --rm test4 ls -la /myfile
    

Running bash test.sh, I got successful builds and prunes. There was one error, from the second prune process: Error response from daemon: a prune operation is already running, which shows that prune detects this conflict situation.

Tested on Docker version 19.03.12, host system CentOS 7.

  3. Yes, it is safe.
    Image layers are locked while a build is using them, while they serve as base layers of other images, and while running containers use them.
    I have done such things many times in parallel with running automated build pipelines, with a running Kubernetes cluster, etc.
