I have a Django application with a lot of data and a MariaDB database running on a Raspberry Pi (OS: Debian 12). The application uses Daphne as the web server because it also contains Django Channels components (WebSocket). Now I want to implement a backup feature that automatically dumps the database, zips the dump together with the other data files, and has the browser automatically download the ZIP file. So I made a view for the download:

from django.http import FileResponse, HttpResponse

def downbackup(request):
    if request.user.is_superuser:
        filename = '../temp/backup/backup.zip'
        sourcefile = open(filename, 'rb')
        return FileResponse(sourcefile)
    else:
        return HttpResponse('No Access.')

That view is called from the relevant template via its URL, and everything is fine, until we meet large files in real life. In this case (file size around 6 GB) Daphne immediately stops operating, and the Raspberry Pi crashes so hard that I have to power-cycle it. In Monitorix I also see huge memory-consumption spikes around these crashes. But there is no error message in the Daphne logs, in the Django logs (even with the debug setting enabled), or in journalctl. Nginx (the reverse proxy) reports an upstream timeout (504). I thought I had learned from the Django documentation that FileResponse buffers the file to avoid high memory consumption, but something must be wrong here. Any ideas or recommendations?
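For context, the dump-and-zip step behind the view looks roughly like this. This is only a sketch: the `build_backup_zip` helper, the `mariadb-dump` arguments, and the paths are assumptions, not code from my application.

```python
import subprocess
import zipfile
from pathlib import Path

def build_backup_zip(dump_cmd, data_files, zip_path):
    """Dump the database, then zip the dump together with the data files.

    dump_cmd   -- dump command, e.g. ['mariadb-dump', '--single-transaction', 'mydb'] (assumption)
    data_files -- iterable of paths to include alongside the dump
    zip_path   -- where to write the resulting archive
    """
    # Write the SQL dump next to the target archive.
    dump_path = Path(zip_path).with_suffix('.sql')
    with open(dump_path, 'wb') as out:
        subprocess.run(dump_cmd, stdout=out, check=True)
    # Pack the dump and the extra data files into one compressed ZIP.
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        zf.write(dump_path, dump_path.name)
        for f in data_files:
            zf.write(f, Path(f).name)
    return zip_path
```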

2 Answers


  1. You can use Django’s StreamingHttpResponse instead of FileResponse to efficiently handle large files.

    from django.http import StreamingHttpResponse
    import os
    
    def download_large_file(request):
        # Path to the large file
        file_path = '/path/to/your/large/file'
    
        def file_iterator(file_path, chunk_size=8192):
            # Read the file lazily in fixed-size chunks so the whole
            # file never sits in memory at once.
            with open(file_path, 'rb') as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    yield chunk
    
        response = StreamingHttpResponse(file_iterator(file_path))
        response['Content-Length'] = os.path.getsize(file_path)
        # Prompt the browser to download rather than display the file.
        response['Content-Disposition'] = 'attachment; filename="backup.zip"'
        return response
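The chunked iterator can be sanity-checked on its own, outside Django, to confirm it reproduces a file byte for byte without ever holding more than one chunk in memory (a quick standalone check, not part of the view above):

```python
import os
import tempfile

def file_iterator(file_path, chunk_size=8192):
    # Same generator as in the view: read fixed-size chunks lazily.
    with open(file_path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Write ~20 KB of test data, then reassemble it from the chunks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'x' * 20000)

reassembled = b''.join(file_iterator(tmp.name))
assert reassembled == b'x' * 20000
assert os.path.getsize(tmp.name) == len(reassembled)
os.unlink(tmp.name)
```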
    
  2. I would strongly advise against using Django as a file server; it is not intended to be used that way. Typically nginx handles this, for example via an X-Accel-Redirect [nginx-doc]: Django essentially says where the file is, and nginx then returns the file itself, in such a manner that nginx only serves the file when the X-Accel-Redirect header is present.

    So essentially you work with:

    def downbackup(request):
        if request.user.is_superuser:
            # Empty body: nginx will serve the actual file.
            response = HttpResponse()
            response['Content-Disposition'] = 'attachment; filename="backup.zip"'
            # Internal redirect, handled by nginx and never sent to the client.
            response['X-Accel-Redirect'] = '/protected/temp/backup/backup.zip'
            return response
        else:
            return HttpResponse('No Access.')

    and then the nginx server thus serves a protected path with:

    location /protected/ {
      internal;
      # Note the trailing slash: alias replaces the /protected/ prefix.
      alias   /path/one/above/tmp/;
    }

    This not only avoids problems with nginx buffering and streaming, it can also let nginx cache certain files and thus boost the efficiency of the web server.

    The internal keyword in the /protected/ block means that this path cannot be requested from outside. A visitor cannot open /protected/temp/backup/backup.zip directly; the path can only be used for nginx-internal redirects coming from the upstream "sub-servers".
