I’m currently facing an issue with Celery 5.3.4 that asks me to install the backports module. The application runs in a container (3.10.13-bullseye) with Python 3.10.13 on a Debian 11 host. When I run the command celery -A app beat -l INFO, I encounter the error shown under "Error:" below.
Here are the details of my setup:
- Host OS: Debian 11
- Python Version inside Container: 3.10.13-bullseye
- Celery Version: 5.3.4
- Docker: Running on both Debian 11 and Windows 10 (Docker Desktop)
pip and Python versions on the Debian 11 host:
root@5b45db7349aa:/app# pip3 --version
pip 23.3.1 from /usr/local/lib/python3.10/site-packages/pip (python 3.10)
root@5b45db7349aa:/app# python --version
Python 3.10.13
Error:
root@5b45db7349aa:/app# celery -A app beat -l info
celery beat v5.3.4 (emerald-rush) is starting.
__ - ... __ - _
LocalTime -> 2023-11-03 10:48:13
Configuration ->
. broker -> redis://redis:6379/0
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%DEBUG
. maxinterval -> 5.00 minutes (300s)
[2023-11-03 10:48:13,939: DEBUG/MainProcess] Setting default socket timeout to 30
[2023-11-03 10:48:13,940: INFO/MainProcess] beat: Starting...
[2023-11-03 10:48:13,945: CRITICAL/MainProcess] beat raised exception <class 'ModuleNotFoundError'>: ModuleNotFoundError("No module named 'backports'")
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/shelve.py", line 111, in __getitem__
    value = self.cache[key]
KeyError: 'entries'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/celery/apps/beat.py", line 113, in start_scheduler
    service.start()
  File "/usr/local/lib/python3.10/site-packages/celery/beat.py", line 634, in start
    humanize_seconds(self.scheduler.max_interval))
  File "/usr/local/lib/python3.10/site-packages/kombu/utils/objects.py", line 31, in __get__
    return super().__get__(instance, owner)
  File "/usr/local/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/usr/local/lib/python3.10/site-packages/celery/beat.py", line 677, in scheduler
    return self.get_scheduler()
  File "/usr/local/lib/python3.10/site-packages/celery/beat.py", line 668, in get_scheduler
    return symbol_by_name(self.scheduler_cls, aliases=aliases)(
  File "/usr/local/lib/python3.10/site-packages/celery/beat.py", line 513, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/celery/beat.py", line 264, in __init__
    self.setup_schedule()
  File "/usr/local/lib/python3.10/site-packages/celery/beat.py", line 541, in setup_schedule
    self._create_schedule()
  File "/usr/local/lib/python3.10/site-packages/celery/beat.py", line 570, in _create_schedule
    self._store['entries']
  File "/usr/local/lib/python3.10/shelve.py", line 114, in __getitem__
    value = Unpickler(f).load()
ModuleNotFoundError: No module named 'backports'
Additional Information:
- The same container, when run on Windows 10 with Docker Desktop, works without any issues.
- I have checked that the versions of pip, Python and Celery are the same.
Dockerfile:
FROM python:3.10.13-slim-bullseye

# With this line Python prints directly to the console without buffering messages
ENV PYTHONUNBUFFERED 1

# Passed in from the Docker Compose file; during development it is overridden and set to true
ARG DEV=false

# Create the environment and upgrade pip
RUN apt-get update
RUN apt-get install -y gcc python3-dev build-essential curl nano
RUN apt-get install -y postgresql-client libjpeg-dev libpq-dev

# Download the desired package(s)
RUN curl https://packages.microsoft.com/keys/microsoft.asc | tee /etc/apt/trusted.gpg.d/microsoft.asc
RUN curl https://packages.microsoft.com/config/debian/11/prod.list | tee /etc/apt/sources.list.d/mssql-release.list
RUN apt-get update
RUN ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for unixODBC development headers
RUN apt-get install -y unixodbc-dev
# optional: kerberos library for debian-slim distributions
RUN apt-get install -y libgssapi-krb5-2

RUN mv /etc/localtime /etc/localtime.old
RUN ln -s /usr/share/zoneinfo/Europe/Rome /etc/localtime

# locales
#RUN echo "it_IT.UTF-8 UTF-8" >> /etc/locale.gen
#RUN locale-gen

COPY ./requirements.txt /tmp/requirements.txt
COPY ./requirements.dev.txt /tmp/requirements.dev.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt

# Strip Windows carriage returns from the start scripts and make them executable
COPY ./compose/celery/celery_worker_start.sh /celery_worker_start.sh
RUN sed -i 's/\r$//g' /celery_worker_start.sh
RUN chmod +x /celery_worker_start.sh

COPY ./compose/celery/celery_beat_start.sh /celery_beat_start.sh
RUN sed -i 's/\r$//g' /celery_beat_start.sh
RUN chmod +x /celery_beat_start.sh

COPY ./compose/celery/celery_flower_start.sh /celery_flower_start.sh
RUN sed -i 's/\r$//g' /celery_flower_start.sh
RUN chmod +x /celery_flower_start.sh

RUN mkdir /app
WORKDIR /app
EXPOSE 8000

# run the container as the django user
#CMD ["run.sh"]
uname -a # from the container:
Linux 3865cf0f97ec 4.19.0-25-amd64 #1 SMP Debian 4.19.289-2 (2023-08-08) x86_64 GNU/Linux
uname -a # from the host:
Linux VSRVDEB01 4.19.0-25-amd64 #1 SMP Debian 4.19.289-2 (2023-08-08) x86_64 GNU/Linux
2 Answers
The same source for a container isn’t guaranteed (and in fact is somewhat unlikely) to produce the same image. There are two pretty reasonable possibilities here. One is that the base image, python:3.10.13-slim-bullseye, was broken and a fix was pushed. This is not uncommon, but I’d also expect some other people to report issues. A new revision can (and likely would) be pushed under the same tag if a glitch was fixed.
In this case I’d go with the latter explanation.
Short solution (assuming this is the issue): Remove the source container images and re-pull them before re-building.
BEWARE! The output must include Pull complete for each layer! Otherwise, the Docker daemon is re-using cached layers, which is what we want to avoid. If you’re having trouble removing all the layers, try ‘pruning’ the Docker images (see https://stackoverflow.com/a/44791684/1120802) or, in the most extreme case, remove all the Docker engine resources and start again (see Docker image corruption? Remove layers?). Be careful, both are destructive actions!
Next, re-build and set the --no-cache option.
If this doesn’t resolve the issue, see the "Not in image issue?" section below.
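The "Pull complete" check from the short solution can also be done programmatically. The following is a minimal Python sketch, assuming the plain-text progress lines that docker pull prints (a layer ID followed by "Pull complete" or "Already exists"). The function name reused_layers and the layer IDs in the sample are made up for illustration; they are not taken from the question.

```python
def reused_layers(pull_output: str) -> list[str]:
    """Return the IDs of layers that `docker pull` reported as 'Already exists'."""
    reused = []
    for line in pull_output.splitlines():
        # Progress lines look like "<layer id>: <status>".
        layer_id, sep, status = line.partition(": ")
        if sep and status.strip() == "Already exists":
            reused.append(layer_id.strip())
    return reused


# Illustrative output only; the layer IDs are invented.
sample = """\
3.10.13-slim-bullseye: Pulling from library/python
aaaaaaaaaaaa: Already exists
bbbbbbbbbbbb: Pull complete
Status: Downloaded newer image for python:3.10.13-slim-bullseye
"""
print(reused_layers(sample))  # prints ['aaaaaaaaaaaa']
```

An empty list means no layer was served from the local cache. A non-empty list means some layers were re-used, so the old image was probably not fully removed before the pull.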
Full Explanation:
To troubleshoot an issue with a bad image, we can use docker image ls to find the ID of our target image:
The image ID is the more precise identifier for the set of bytes composing a given Docker image. In my case, 3.10.13-slim-bullseye has the (shortened) ID of ee6be26d226b. If this were to differ between machines, we’d know that the image is different. For the sake of example, we can see how the ubuntu:latest image is different (on a different machine) before and after pulls:
A given Docker image is composed of ‘layers’ (you’ve probably seen them referenced when pushing to or pulling from a registry). After pulling an image, these layers reside on the local filesystem. Unfortunately, the contents of these layers can be corrupted when the image is first written to disk, when pulling into the final image, or any time in between. As an example of how this could happen, let’s intentionally introduce corruption and examine how this appears. We’ll use the base image referenced in this post, python:3.10.13-slim-bullseye, as an example.
First, find the UpperDir of the image we’re using. This is a one-liner that packs a bit together, but outputs the location on our filesystem where some of our base container contents are stored:
Neat, now let’s corrupt our image:
Next, build a container using the referenced image. In this example I’m using the Dockerfile included below. Note the --no-cache!
Finally, run the image and list the contents of the root filesystem.
Our flag, hello-from-corruption-town, is present in our image! Let’s see what happens if we try re-pulling the image:
We can see that re-pulling a corrupt image will not fix the image. In this example we have added a flag (hello-from-corruption-town) but this could also be corruption of some other form, whether that’s a bad binary, truncated file, or missing directory. The best solution is to remove all related images and download the image layers again.
Wait, not in image issue?
Try building a minimal reproduction of the issue, starting with a minimal Dockerfile such as the following:
From here, iteratively tweak the Dockerfile until you can reproduce the issue; at that point you’ll know which step caused it.
Maybe you have some "old" scheduler data?
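One way to test this idea: the traceback in the question fails while unpickling the entries key of the celerybeat-schedule shelve file, so the stored pickle must reference a module that cannot be imported in the current environment. The Python sketch below lists the modules a shelve entry refers to without importing any of them. It is a rough diagnostic under stated assumptions, not part of Celery: the names modules_needed, _ModuleRecorder and _Placeholder are hypothetical, and the placeholder class only handles common pickle constructs, so it may fail on unusual data.

```python
import dbm
import io
import pickle


class _Placeholder:
    """Stand-in for any class or function referenced by the pickle."""

    def __new__(cls, *args, **kwargs):
        return object.__new__(cls)

    def __init__(self, *args, **kwargs):
        pass

    def __setstate__(self, state):
        # Ignore the stored state; only the module names matter here.
        pass


class _ModuleRecorder(pickle.Unpickler):
    """Unpickler that records module names instead of importing them."""

    def __init__(self, file):
        super().__init__(file)
        self.modules = set()

    def find_class(self, module, name):
        self.modules.add(module)
        return _Placeholder


def modules_needed(schedule_path, key="entries"):
    """Return the sorted module names referenced by one shelve entry."""
    # shelve stores each value as pickled bytes under a UTF-8 encoded key.
    with dbm.open(schedule_path, "r") as db:
        raw = db[key.encode("utf-8")]
    recorder = _ModuleRecorder(io.BytesIO(raw))
    recorder.load()
    return sorted(recorder.modules)
```

Calling modules_needed("celerybeat-schedule") from the directory where beat runs (the relative db path shown in the startup banner) returns the module list. If backports or one of its submodules appears there, the schedule file was most likely written in an environment where that module was installed.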
For me, removing the celerybeat-schedule.bak, celerybeat-schedule.dat and celerybeat-schedule.dir files and restarting celery beat solved the issue.
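The cleanup can be scripted so it runs before beat starts. This is a minimal Python sketch, assuming the schedule store uses the default celerybeat-schedule base name shown in the question's startup banner; the function name remove_beat_schedule is hypothetical. It matches any file whose name starts with that base name, which covers the .bak, .dat and .dir files as well as a single-file store.

```python
from pathlib import Path


def remove_beat_schedule(directory=".", basename="celerybeat-schedule"):
    """Delete the beat schedule store and return the names of removed files."""
    removed = []
    for path in sorted(Path(directory).glob(basename + "*")):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

For the container in the question this would be called as remove_beat_schedule("/app"), since /app is the WORKDIR. Beat recreates the schedule file on the next start, but the stored last-run times are lost.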