
I have several websites running Django/Apache deployed in AWS ElasticBeanstalk.

My only problem is the hundreds of emails I receive every day with this subject:

[Django] ERROR (EXTERNAL IP): Invalid HTTP_HOST header: WHATEVER. You may need to add WHATEVER to ALLOWED_HOSTS.

That happens because the ALLOWED_HOSTS variable is properly configured in my Django settings, so Django rejects those requests and reports each one. What I need is for Apache to block all requests with an invalid HTTP_HOST header so they never reach Django, which would get rid of these hundreds of emails every day.

There are dozens of examples of how to do that in Apache here and there, I know, but I haven’t found a single example of how to do it when deploying in AWS ElasticBeanstalk.

If you are not familiar with AWS ElasticBeanstalk, just keep in mind that this system automatically creates a /etc/httpd/conf.d/wsgi.conf file with some configuration written by Amazon, and Amazon can (and will) modify it in the future outside our control.

So, when we want to add some configuration to Apache, such as providing a redirection, the preferred way is providing in our project a YAML file defining a new standalone Apache config file, that will be taken into account apart from the wsgi.conf file that Amazon creates automatically, like this:

files:
    "/etc/httpd/conf.d/ssl_rewrite.conf":
        mode: "000644"
        owner: root
        group: root
        content: |
            RewriteEngine On
            <If "-n '%{HTTP:X-Forwarded-Proto}' && %{HTTP:X-Forwarded-Proto} != 'https'">
            RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R,L]
            </If>

Doing that, once we deploy our code, a /etc/httpd/conf.d/ssl_rewrite.conf file will be created and used by Apache too (we are not editing the original wsgi.conf file, we are just providing more configurations in a new file).

There is an amazing explanation of this pattern here: https://stackoverflow.com/a/38751648/1062587.

These YAML config files can create Apache config files or append something to existing files, but they can’t replace or edit existing content.

However, regarding my problem, the original wsgi.conf file already provides a section like this:

<VirtualHost *:80>
  [...]
  <Directory /opt/python/current/app/>
    Require all granted
  </Directory>
  [...]
</VirtualHost>

Since I want to block all invalid host requests, I assume I need that to be changed to something like this (according to this answer: https://stackoverflow.com/a/43322857/1062587):

<VirtualHost *:80>
  [...]
  <Directory /opt/python/current/app/>
    Require expr "%{HTTP_HOST} in {'whatever.com', 'www.whatever.com'}"
    Options
  </Directory>
  [...]
</VirtualHost>

My problem is that YAML files can’t edit existing files. I can only provide new configuration with them.

So I don’t know whether a standalone Apache config file can override the existing <VirtualHost> -> <Directory> section. If it can, I could define the required directives in a YAML file like the ones ElasticBeanstalk understands.

Otherwise, I will have to provide a script to make the replacement on the fly. I know how to do that, but I find it ugly. I am just asking if there is a more elegant solution.

2 Answers


  1. Chosen as BEST ANSWER

    Since I haven't found a better approach using YAML files, I am using a Python script to modify the original Apache wsgi.conf file in the most agnostic way I can think of, so that it keeps working even if Amazon modifies the file in the future.

    I share it here in case anyone can find it useful. With this approach, you don't have to hardcode the hosts whitelist anywhere in code.

    1. First add a new command in .ebextensions/deploy.config
    container_commands:
        01__apache_block_invalid_hosts:
            command: python .ebextensions/apache_block_invalid_hosts.py
    

    Notice that you can consider adding leader_only: true if you are not planning to change the DJANGO_ALLOWED_HOSTS environment variable at any time (explained later) and if you understand the implications.

    2. Create the Python script in .ebextensions/apache_block_invalid_hosts.py
    from enum import Enum, auto
    import os


    # Replacement body for the <Directory> block. Doubled braces are literal
    # braces for str.format(); {hosts} is the only placeholder.
    NEW_AUTH_DIRECTIVE = """
      Require expr "%{{HTTP_HOST}} in {{{hosts}}}"
      Options
    """

    class Step(Enum):
        BEFORE_AUTH = auto()  # copying lines up to and including <Directory ...>
        INSIDE_AUTH = auto()  # dropping the original body of the block
        AFTER_AUTH = auto()   # copying the rest of the file unchanged

    step = Step.BEFORE_AUTH
    with open('/etc/httpd/conf.d/wsgi.conf', 'r') as file_in, open('../wsgi.conf', 'w') as file_out:
        for line in file_in:
            if step == Step.BEFORE_AUTH:
                file_out.write(line)
                if "<Directory /opt/python/current/app/>" in line:
                    # Quote each host, e.g. 'whatever.com', 'www.whatever.com'
                    hosts = ", ".join(f"'{i}'" for i in os.environ['DJANGO_ALLOWED_HOSTS'].split('__'))
                    file_out.write(NEW_AUTH_DIRECTIVE.format(hosts=hosts))
                    step = Step.INSIDE_AUTH
            elif step == Step.INSIDE_AUTH:
                # Skip the original directives; keep only the closing tag.
                if "</Directory>" in line:
                    file_out.write(line)
                    step = Step.AFTER_AUTH
            elif step == Step.AFTER_AUTH:
                file_out.write(line)
    

    Notice that the output path of the edited file is ../wsgi.conf and not /etc/httpd/conf.d/wsgi.conf. Trust me on this, it works.

    3. Define an environment variable with all the whitelisted hosts in the AWS EB configuration website:
    DJANGO_ALLOWED_HOSTS         whatever.com__www.whatever.com__whatever.us-east-1.elasticbeanstalk.com
    

    Notice that I am using the __ separator on purpose, instead of a comma. This is because I sometimes create/clone environments directly from my command line using eb clone while providing changes in the values of environment variables. If you do that, you can't include commas inside values, and there is no way to escape them.

    4. Use the same environment variable in your Django settings file (I use the django-environ library to read environment variables from the system):
    import environ
    env = environ.Env()
    env.read_env()
    ALLOWED_HOSTS = env('DJANGO_ALLOWED_HOSTS', default='*').split('__')
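
The Python script above does its work at module level against real files, which makes it awkward to check before a deployment. As a minimal sketch, the same rewrite can be expressed as a pure function over the file's text. The names `parse_hosts`, `build_require_lines` and `rewrite_conf` are hypothetical and not part of the answer, and raising an error when the `<Directory>` block is missing is an assumption about the desired behaviour (the original script would silently copy the file unchanged).

```python
DIRECTORY_OPEN = "<Directory /opt/python/current/app/>"


def parse_hosts(raw, sep="__"):
    """Split a DJANGO_ALLOWED_HOSTS-style value, dropping empty entries."""
    return [h.strip() for h in raw.split(sep) if h.strip()]


def build_require_lines(hosts):
    """Return the directives that replace the body of the <Directory> block."""
    quoted = ", ".join(f"'{h}'" for h in hosts)
    return [
        f'    Require expr "%{{HTTP_HOST}} in {{{quoted}}}"\n',
        "    Options\n",
    ]


def rewrite_conf(conf_text, hosts):
    """Replace the body of the first matching <Directory> block.

    Everything outside that block is copied through unchanged.
    """
    out = []
    inside = False    # True while skipping the original block body
    replaced = False  # True once the block has been rewritten
    for line in conf_text.splitlines(keepends=True):
        if inside:
            if "</Directory>" in line:
                out.append(line)
                inside = False
            continue
        out.append(line)
        if not replaced and DIRECTORY_OPEN in line:
            out.extend(build_require_lines(hosts))
            inside = True
            replaced = True
    if not replaced:
        # Assumption: failing loudly is preferable to deploying an
        # unmodified config without noticing.
        raise ValueError("expected <Directory> block not found")
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "<VirtualHost *:80>\n"
        "  <Directory /opt/python/current/app/>\n"
        "    Require all granted\n"
        "  </Directory>\n"
        "</VirtualHost>\n"
    )
    print(rewrite_conf(sample, parse_hosts("whatever.com__www.whatever.com")))
```

A thin wrapper could then read /etc/httpd/conf.d/wsgi.conf, pass its text and `parse_hosts(os.environ['DJANGO_ALLOWED_HOSTS'])` to `rewrite_conf`, and write the result to ../wsgi.conf as the original script does.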
    

  2. Apache looks at the Host HTTP request header when it decides on name-based routing: https://httpd.apache.org/docs/2.4/vhosts/name-based.html

    Just have a default virtual host that serves a static 404 page, and route to Django only when a proper Host HTTP request header is specified.

    It is also documented in Django: https://docs.djangoproject.com/en/3.1/howto/deployment/checklist/#allowed-hosts
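
To illustrate why a default virtual host catches requests with unknown hosts, here is a simplified Python model of name-based selection as the linked Apache documentation describes it: the first virtual host whose ServerName or ServerAlias matches the Host header wins, and if none matches, the first listed virtual host is used. This is only a sketch, not Apache's implementation. `select_vhost` and the labels are hypothetical, the port is stripped naively, and IPv6 literals are not handled.

```python
from fnmatch import fnmatchcase


def select_vhost(host_header, vhosts):
    """Return the label of the virtual host that would serve the request.

    vhosts is a list of (label, names) pairs in configuration order, where
    names holds the ServerName and ServerAlias values of that virtual host.
    """
    if not vhosts:
        raise ValueError("at least one virtual host is required")
    # Host names are case-insensitive; drop an optional ":port" suffix.
    host = host_header.strip().lower().partition(":")[0]
    for label, names in vhosts:
        # ServerAlias may contain wildcards such as *.whatever.com.
        if any(fnmatchcase(host, name.lower()) for name in names):
            return label
    # No name matched: the first listed virtual host acts as the default.
    return vhosts[0][0]


if __name__ == "__main__":
    vhosts = [
        ("default-404", []),
        ("django", ["whatever.com", "www.whatever.com"]),
    ]
    for header in ("whatever.com", "WWW.Whatever.com:80", "evil.example"):
        print(header, "->", select_vhost(header, vhosts))
```

In this model, listing the catch-all virtual host first means any request whose Host header is not explicitly named falls through to it and never reaches the Django virtual host.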
