- Heroku gives me an H12 (request timeout) error when transferring a file to an API from my Django application. (I understand it is a long-running process and I guess there is some memory/worker trade-off involved.) I am on a single hobby dyno right now.
- The function runs smoothly for files up to around 50 MB. The file itself comes from a different source (fetched with the `requests` Python package).
- The idea is to build a file transfer utility as a Django app on Heroku. The file is never stored on my side; it just gets fetched from point A and sent to point B (see the sketch after this list).
- I went through multiple discussions as well as the standard Heroku documentation, but I am still struggling with some concepts:
- Will this problem really be solved by background tasks? (If yes, I am looking for an explanation of the process rather than just the direct way to do it, so that I can optimize my flow.)
- The standard docs recommend background tasks using the RQ package for Python. I am using PostgreSQL at the moment. Will I need to install and manage a Redis database as well for this? Is this even related to the database?
- Some recommend adding an extra worker besides the default web worker. How does this relate to my problem?
- Some say to add multiple workers, and I am not sure how that solves it. Say background tasks make large files work today; what happens if the number of simultaneous users increases? How will that impact my solution, and how should I plan mitigation around the risks?
- If someone here has a strong understanding of this kind of architecture, I would like to hear your experiences and thoughts. Also, let me know if there is an easier option than Heroku from a solution standpoint.
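For context, the transfer function is essentially the following minimal sketch (the URLs are placeholders and error handling is trimmed). Even with streaming, the web request stays open for the whole transfer, which is what trips Heroku's 30-second H12 limit:

```python
import requests

SOURCE_URL = "https://example.com/source-file"    # hypothetical point A
DEST_URL = "https://example.com/upload-endpoint"  # hypothetical point B

def transfer_file():
    # Stream the download and feed the chunks straight into the upload,
    # so the file is never fully buffered in memory or written to disk.
    with requests.get(SOURCE_URL, stream=True) as src:
        src.raise_for_status()
        resp = requests.post(DEST_URL, data=src.iter_content(chunk_size=1 << 20))
        resp.raise_for_status()
```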
2 Answers
Here is my final take on this, after full evaluation, trials, and the earlier recommendations made here. Thanks @arun.
Have you looked at using Celery to run this as a background task? This is a very standard way of dealing with requests that take a long time to complete.
Yes, it can be solved by background tasks. If you use something like Celery, which has direct support for Django, you will be running another instance of your Django application, just with a different startup command for Celery. That instance keeps polling for new tasks to execute: it reads the task name from the Redis queue (or RabbitMQ, whichever you use as the broker), executes the task, and writes the status back to Redis (or whichever broker you use).
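To make that concrete, here is a sketch of the transfer as a Celery task (the project name `myproject` and the task arguments are illustrative, not your exact code):

```python
# tasks.py -- a sketch of the transfer as a Celery task.
import requests
from celery import shared_task

@shared_task
def transfer_file(source_url, dest_url):
    # This runs on a Celery worker process, not the web dyno, so Heroku's
    # 30-second router timeout no longer applies to the transfer itself.
    with requests.get(source_url, stream=True) as src:
        src.raise_for_status()
        resp = requests.post(dest_url, data=src.iter_content(chunk_size=1 << 20))
        resp.raise_for_status()
    return resp.status_code
```

Your Django view then just enqueues the task with `transfer_file.delay(source_url, dest_url)` and returns immediately; the client can poll a status endpoint if it needs to know when the transfer finished.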
You can also use Flower alongside Celery so that you have a dashboard showing how many tasks are being executed, what their statuses are, and so on.
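If you go that route, Flower runs as its own process pointed at the same broker, typically started with `celery -A myproject flower` (again assuming a project named `myproject`).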
To use background tasks with Celery, you will need to set up some sort of message broker such as Redis or RabbitMQ.
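For a Django project that usually means two small pieces of wiring: the standard Celery bootstrap module and a broker URL in settings. A sketch, assuming the project is named `myproject` and you use the Heroku Redis add-on (which sets a `REDIS_URL` config var):

```python
# myproject/celery.py -- the standard Celery/Django bootstrap.
import os
from celery import Celery

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")

app = Celery("myproject")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks()
```

```python
# myproject/settings.py (excerpt) -- point Celery at the broker.
# On Heroku the Redis add-on sets REDIS_URL; locally fall back to localhost.
import os

CELERY_BROKER_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
```

Note that Redis here is only the message queue; it does not replace PostgreSQL, which stays your application database.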
I don't think that would help for your use case.
When you use Celery, you will have to start a few workers for that Celery instance; these workers are the ones that execute your background tasks. The Celery documentation will help you calculate the exact worker count based on your instance's CPU, memory, and so on (see the Procfile sketch below).
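On Heroku these workers run as a separate `worker` process type that you scale independently of `web`. A Procfile sketch, again assuming `myproject`:

```
web: gunicorn myproject.wsgi
worker: celery -A myproject worker --concurrency=2 --loglevel=info
```

`--concurrency` controls how many tasks a single worker dyno executes in parallel, and `heroku ps:scale worker=2` adds more worker dynos if user load grows, without touching the web process.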
I have worked on a few projects where we used Celery background tasks to upload large files, and it has worked well for our use cases.