Trying to get a celery-based scraper up and running. The celery worker seems to function on its own, but when I also run the celery beat server, the worker gives me this KeyError:
    File "c:\users\myusername\.virtualenvs\django-news-scraper-dbqk-dk5\lib\site-packages\celery\worker\consumer\consumer.py", line 555, in on_task_received
        strategy = strategies[type_]
KeyError: 'core.tasks.scrape_dev_to'
[2020-10-04 16:51:41,231: ERROR/MainProcess] Received unregistered task of type 'core.tasks.scrape_dev_to'.
The message has been ignored and discarded.
I've been through many similar answers on Stack Overflow, but none solved my problem. I'll list the things I tried at the end.
Project structure:

    core/
        tasks.py
    newsscraper/
        celery.py
        settings.py
tasks.py:

    import time

    from newsscraper.celery import shared_task, task

    from .scrapers import scrape


    @task
    def scrape_dev_to():
        URL = "https://dev.to/search?q=django"
        scrape(URL)
        return
settings.py:

    INSTALLED_APPS = [
        'django.contrib.admin',
        ...
        'django_celery_beat',
        'core',
    ]

    ...

    # I added this setting while troubleshooting; it produced a new ModuleNotFoundError for core.tasks
    # CELERY_IMPORTS = (
    #     'core.tasks',
    # )

    CELERY_BROKER_URL = 'redis://localhost:6379'

    CELERY_BEAT_SCHEDULE = {
        "ScrapeStuff": {
            'task': 'core.tasks.scrape_dev_to',
            'schedule': 10,  # crontab(minute="*/30")
        },
    }
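As an aside, if the commented-out crontab variant is what you eventually want, `crontab` has to be imported in settings.py. A sketch of that fragment, assuming the 30-minute cadence from the comment:

```python
# newsscraper/settings.py -- sketch: the crontab schedule variant needs this
# import at the top of settings.py, otherwise the name is undefined.
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    "ScrapeStuff": {
        "task": "core.tasks.scrape_dev_to",
        "schedule": crontab(minute="*/30"),  # instead of the bare 10 seconds
    },
}
```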
celery.py:

    from __future__ import absolute_import, unicode_literals
    import os

    from celery import Celery

    # Set the default Django settings module for the 'celery' program.
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'newsscraper.settings')

    app = Celery('newsscraper')
    app.config_from_object('django.conf:settings', namespace='CELERY')

    # Load task modules from all registered Django app configs.
    app.autodiscover_tasks()
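One thing worth checking alongside celery.py: the Celery "First steps with Django" guide pairs it with the project package's `__init__.py` importing the app, so the app (and its `autodiscover_tasks()` call) is created whenever Django loads the package. A sketch, assuming the standard layout:

```python
# newsscraper/__init__.py -- importing the app here guarantees it exists
# (and that app.autodiscover_tasks() has been wired up) whenever Django
# imports the newsscraper package, e.g. when beat or a worker starts.
from .celery import app as celery_app

__all__ = ("celery_app",)
```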
When I run the celery worker with debug logging, I see that celery doesn't have the task I want (scrape_dev_to) registered. Shouldn't the app.autodiscover_tasks() call in celery.py take care of this? Here's the output:
. celery.accumulate
. celery.backend_cleanup
. celery.chain
. celery.chord
. celery.chord_unlock
. celery.chunks
. celery.group
. celery.map
. celery.starmap
I also get a ModuleNotFoundError when I try to add core.tasks to a CELERY_IMPORTS setting. This is my best guess for where the problem is, but I don’t know how to solve it.
Things I tried:

- Adding core.tasks to a CELERY_IMPORTS setting. This causes a new error when I run celery beat: "No module named 'core.tasks'".
- Hardcoding the name in the task decorator: name='core.tasks.scrape_dev_to'
- Specifying the celery config explicitly when calling the worker: celery -A newsscraper worker -l INFO -settings=celeryconfig
- Playing with my imports (from newsscraper.celery instead of from celery, for instance)
- Adding some config code to the __init__.py for the package containing tasks (I already had it in the __init__.py for the package containing settings.py and celery.py)
- python manage.py check identifies no issues
- Calling the worker with core.tasks explicitly: celery -A core.tasks worker -l INFO
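On the CELERY_IMPORTS attempt: the follow-up ModuleNotFoundError usually means the process was not started from the directory that contains core/, so the package is not importable. For reference, the fragment itself (consumed through the CELERY_ namespace configured in celery.py) would look like:

```python
# newsscraper/settings.py -- sketch of the explicit-imports fallback. With
# config_from_object('django.conf:settings', namespace='CELERY') this is
# read as Celery's `imports` setting; it only resolves if the worker and
# beat are launched from the project root (the directory containing core/).
CELERY_IMPORTS = ("core.tasks",)
```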
3 Answers
I had the same problem, and this setup solved it for me: declaring the task modules through the imports setting in your settings (see the Celery docs reference for the imports setting).
This can occur when you configure a celery task and then remove it. Just deconfigure the tasks and configure them again.
In settings.py, I added one line and it worked for me.