I am using Celery to run background jobs for my Django app, hosted on Heroku, with Redis as the broker, and I want to set up task prioritization.
I am currently using the default Celery queue, which all the workers consume from. I was thinking about implementing prioritization within that single queue, but it is described everywhere as bad practice.
The consensus on the best approach to deal with the priority problem is to set different Celery queues for each level of priority. Let’s say:
- Queue1 for highest priority tasks, assigned to x workers
- Queue2, the default queue, assigned to all the other workers
The first problem I see with this method is that if there are no high priority tasks at some point, I lose the productivity of x workers.
Also, let’s say my infrastructure scales up and I have more workers available. Only the number of “default” workers will be expanded dynamically. Besides, this method prevents me from keeping identical dynos (containers on Heroku) which doesn’t look optimized for scalability.
Is there an efficient way to deal with task prioritization and keep replicable workers at the same time?
2 Answers
For the answer, W1 and W2 are workers consuming high and low priority tasks respectively.
You can scale W1 and W2 as separate containers. You can have three containers, all built from the same image: one for the app and two for the workers. If you have a higher number of one kind of task, only that container needs to scale. Also, depending on the kind of dyno you are using, you can set the workers' concurrency to make better use of resources.
For your reference, this is something that I did in one of my projects.
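As an illustration only (not the code from that project), a minimal sketch of such a split could look like the following. The queue names `high` and `default`, the task-name prefix `myapp.tasks.urgent.`, the project name `myproject`, and the concurrency values are all hypothetical, and the `CELERY_` settings prefix assumes the Celery app is configured with `app.config_from_object("django.conf:settings", namespace="CELERY")`.

```python
# settings.py fragment: route tasks to one of two queues by task name.
# All names below are placeholders chosen for this sketch.
HIGH_PRIORITY_QUEUE = "high"
DEFAULT_QUEUE = "default"
HIGH_PRIORITY_PREFIX = "myapp.tasks.urgent."


def route_task(name, args, kwargs, options, task=None, **kw):
    """Celery router function: choose a queue from the task's dotted name."""
    if name.startswith(HIGH_PRIORITY_PREFIX):
        return {"queue": HIGH_PRIORITY_QUEUE}
    return {"queue": DEFAULT_QUEUE}


# Register the router and make "default" the fallback queue.
CELERY_TASK_ROUTES = (route_task,)
CELERY_TASK_DEFAULT_QUEUE = DEFAULT_QUEUE

# One worker process type per queue, e.g. two Procfile entries on Heroku,
# both started from the same image and scaled independently:
#   celery -A myproject worker -Q high --concurrency=4      # W1
#   celery -A myproject worker -Q default --concurrency=2   # W2
```

Because both worker types run the same code and differ only in the `-Q` argument, the containers stay identical apart from their start command.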
I know this is a very old question, but this might be helpful for someone who is still looking for an answer.
In my current project, we have an implementation like this
All these queues hold different tasks based on priority (low, medium and high). We have to implement the priority for our Celery tasks.
A) Let’s assume this scenario in which we are more interested in processing the tasks based on priority.
B) Queue1 for highest priority tasks, assigned to x workers
Queue2 the default queue, assigned to all the other workers
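For option A on a Redis broker, one possible approach is Celery's emulated message priorities, which keep all workers identical. The sketch below is an assumption about how that could be configured, not a confirmed part of the setup described above. The `PRIORITY_BY_LEVEL` mapping and the `priority_for` helper are hypothetical, and the `CELERY_` prefix again assumes `namespace="CELERY"`. With the Redis transport, lower numbers are consumed first, so 0 is the highest priority.

```python
# settings.py fragment: emulated message priorities on the Redis transport.
CELERY_BROKER_TRANSPORT_OPTIONS = {
    "priority_steps": list(range(10)),   # priority levels 0..9
    "sep": ":",                          # separator used in the per-priority list names
    "queue_order_strategy": "priority",  # consume in priority order
}

# Prefetching fewer messages keeps already-reserved low priority tasks
# from delaying newly queued high priority ones.
CELERY_WORKER_PREFETCH_MULTIPLIER = 1

# Hypothetical mapping from a readable level to a Redis priority number.
PRIORITY_BY_LEVEL = {"high": 0, "medium": 5, "low": 9}


def priority_for(level: str) -> int:
    """Return the priority number to pass to apply_async for a given level."""
    return PRIORITY_BY_LEVEL[level]
```

A task would then be queued with something like `my_task.apply_async(args=[...], priority=priority_for("high"))`, where `my_task` stands for any of your tasks. Priorities emulated this way on Redis are best effort rather than strict, so it is worth testing under realistic load before relying on them.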
We have to choose between these two options based on our requirements. I hope this is helpful,
Suresh.