
I am facing this issue: when I run my Celery worker inside a Docker container, it is not picking up tasks.

I am using Flask and Celery.

Here are my logs when I run it without Docker:

[email protected] v4.4.2 (cliffs)

Darwin-18.2.0-x86_64-i386-64bit 2020-05-26 22:16:40

[config]
.> app:         __main__:0x111343470
.> transport:   redis://localhost:6379//
.> results:     redis://localhost:6379/
.> concurrency: 8 (prefork)
.> task events: ON

[queues]
.> celery           exchange=celery(direct) key=celery


[tasks]
  . load_data.scraping.tasks.scrape_the_data_daily
  . scrape the data daily

You can clearly see that my worker finds the task, but it is not running the periodic task.

When I run the same command in Docker, here is what I am getting:

celery-worker_1  | /usr/local/lib/python3.6/site-packages/celery/platforms.py:801: RuntimeWarning: You're running the worker with superuser privileges: this is
celery-worker_1  | absolutely not recommended!
celery-worker_1  | 
celery-worker_1  | Please specify a different user using the --uid option.
celery-worker_1  | 
celery-worker_1  | User information: uid=0 euid=0 gid=0 egid=0
celery-worker_1  | 
celery-worker_1  |   uid=uid, euid=euid, gid=gid, egid=egid,
celery-worker_1  | [2020-05-26 18:54:02,088: DEBUG/MainProcess] | Worker: Preparing bootsteps.
celery-worker_1  | [2020-05-26 18:54:02,090: DEBUG/MainProcess] | Worker: Building graph...
celery-worker_1  | [2020-05-26 18:54:02,092: DEBUG/MainProcess] | Worker: New boot order: {Timer, Hub, Pool, Autoscaler, StateDB, Beat, Consumer}

So it’ looks like it’s not finding the app and the tasks.

But if I execute the command manually from inside the Docker container, I can see that my tasks are found.
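
One way to run that kind of check is with Celery's inspect command from inside the running worker container (a sketch; the service and app names are taken from the compose file below):

docker-compose exec celery-worker celery -A apis.celery_app:app inspect registered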

Here is how I set up my docker-compose file:

  web:
    image: apis
    build: .
    command: uwsgi --http 0.0.0.0:5000 --module apis.wsgi:app
    env_file:
      - ./.env
    environment:
      - POSTGRES_HOST=db
      - CELERY_BROKER_URL=redis://redis:6379
      - CELERY_RESULT_BACKEND_URL=redis://redis:6379
    volumes:
      - ./apis:/code/apis
      - ./tests:/code/tests
      - ./load_data:/code/load_data
      - ./db/:/db/
    ports:
      - "5000:5000"
    links: 
      - redis
  redis:
    image: redis
  celery-beat:
    image: apis
    command: "celery -A apis.celery_app:app beat -S celerybeatredis.schedulers.RedisScheduler --loglevel=info"
    env_file:
      - ./.env
    depends_on:
      - redis
    links: 
      - redis
    environment:
      - CELERY_BROKER_URL=redis://redis:6379
      - CELERY_RESULT_BACKEND_URL=redis://redis:6379
      - CELERY_REDIS_SCHEDULER_URL=redis://redis:6379
      - C_FORCE_ROOT=true
    volumes:
      - ./apis:/code/apis
      - ./tests:/code/tests
      - ./load_data:/code/load_data
      - ./db/:/db/
    shm_size: '64m'
  celery-worker:
    image: apis
    command: "celery worker -A apis.celery_app:app --loglevel=debug -E"
    env_file:
      - ./.env
    depends_on:
      - redis
      - celery-beat
    links: 
      - redis
    environment:
      - CELERY_BROKER_URL=redis://redis:6379
      - CELERY_RESULT_BACKEND_URL=redis://redis:6379
      - CELERY_REDIS_SCHEDULER_URL=redis://redis:6379
      - C_FORCE_ROOT=true
    volumes:
      - ./apis:/code/apis
      - ./tests:/code/tests
      - ./load_data:/code/load_data
      - ./db/:/db/
    shm_size: '64m'

and the Celery setup is like this:

from apis.app import init_celery
from celery.schedules import crontab
from apis.config import CELERY_REDIS_SCHEDULER_KEY_PREFIX, CELERY_REDIS_SCHEDULER_URL
from celery.task.control import inspect

app = init_celery()
# Import the modules that define the tasks so they get registered with the app.
app.conf.imports = app.conf.imports + ("load_data.scraping.tasks",)
app.conf.imports = app.conf.imports + ("apis.models.address", )

# Beat schedule: run the scraping task every 5 minutes.
app.conf.beat_schedule = {
    'get-data-every-day': {
        'task': 'load_data.scraping.tasks.scrape_the_data_daily',
        'schedule': crontab(minute='*/5'),
    },
}
app.conf.timezone = 'UTC'
app.conf.CELERY_REDIS_SCHEDULER_URL = CELERY_REDIS_SCHEDULER_URL
app.conf.CELERY_REDIS_SCHEDULER_KEY_PREFIX = CELERY_REDIS_SCHEDULER_KEY_PREFIX

# Debug output: ask any running workers which tasks they have registered.
i = inspect()
print(10*"===", i.registered_tasks())

And Celery is initialized like this:

def init_celery(app=None):
    app = app or create_app()
    celery.conf.broker_url = app.config["CELERY_BROKER_URL"]
    celery.conf.result_backend = app.config["CELERY_RESULT_BACKEND"]
    celery.conf.update(app.config)

    class ContextTask(celery.Task):
        """Make celery tasks work with Flask app context"""

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return self.run(*args, **kwargs)

    celery.Task = ContextTask
    return celery
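
For context, the periodic task referenced in the schedule would be registered along these lines (a simplified sketch, not the actual load_data/scraping/tasks.py; the import of the shared Celery instance and the task body are assumptions):

# load_data/scraping/tasks.py -- simplified sketch, not the real module
from apis.app import celery  # assumption: the shared Celery instance is defined in apis.app

@celery.task(name="load_data.scraping.tasks.scrape_the_data_daily")
def scrape_the_data_daily():
    """Periodic task triggered by beat; the real scraping logic is omitted."""
    ...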

Basically, I have two questions.

  1. Why am I not seeing the tasks when the worker runs inside the Docker container?
  2. Why are my tasks not running?

Any ideas are welcome.

2 Answers


  1. Chosen as BEST ANSWER

    Okay,

    I still don't know why the worker logs are not displaying the tasks in Docker.

    But the problem was the beat scheduler I was using: for some reason, it was not sending the schedule for the task.

    I just changed the scheduler to the celery-redbeat package, which is well documented and helped me achieve what I wanted.

    Here is the Celery config, set up according to its documentation:

    from apis.app import init_celery
    from celery.schedules import crontab
    from apis.config import CELERY_REDIS_SCHEDULER_URL
    
    app = init_celery()
    app.conf.imports = app.conf.imports + ("load_data.scraping.tasks",)
    app.conf.imports = app.conf.imports + ("apis.models.address", )
    
    app.conf.beat_schedule = {
        'get-data-every-day': {
            'task': 'load_data.scraping.tasks.scrape_the_data_daily',
            'schedule': crontab(minute='*/60'),
        },
    }
    app.conf.timezone = 'UTC'
    app.conf.redbeat_redis_url = CELERY_REDIS_SCHEDULER_URL  # my Redis URL
    

    And I updated the command that runs beat to this:

    celery -A apis.celery_app:app beat -S redbeat.RedBeatScheduler --loglevel=info
    

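    The RedBeatScheduler class comes from the celery-redbeat package, so it needs to be installed in the image. Applied to the compose file from the question, the celery-beat service ends up looking roughly like this (a sketch; only the command changes, everything else in the service stays as it was):

      celery-beat:
        image: apis
        command: "celery -A apis.celery_app:app beat -S redbeat.RedBeatScheduler --loglevel=info"
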
  2. I cannot comment as I don't have 50 karma. I'm willing to bet there is a networking issue. Ensure all your containers are listening on the correct interface.

    What makes me think this is that your redis service in docker-compose isn't declaring any networking parameters, so the defaults will be used (which is localhost). That would mean the redis container isn't accessible from outside the container.

    After you run docker-compose up, run docker ps -a to see what interface redis is listening on.
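
    If it does turn out to be a networking problem, a quick way to test from the host is to publish the default Redis port on the redis service (a sketch of the relevant compose fragment, for debugging only):

      redis:
        image: redis
        ports:
          - "6379:6379"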
