I have several websites running Django/Apache deployed in AWS ElasticBeanstalk.
My only problem is the hundreds of emails I receive every day with this subject:
[Django] ERROR (EXTERNAL IP): Invalid HTTP_HOST header: WHATEVER. You may need to add WHATEVER to ALLOWED_HOSTS.
This happens because I have properly configured the ALLOWED_HOSTS variable in my Django configuration, so Django rejects these requests and reports each one. What I need is for Apache to block all requests with an invalid HTTP_HOST header so that they never reach Django, which would get rid of these hundreds of emails every day.
There are dozens of examples of how to do that in Apache here and there, I know, but I haven’t found a single example of how to do it when deploying in AWS ElasticBeanstalk.
If you are not familiar with AWS ElasticBeanstalk, just keep in mind that this system automatically creates a /etc/httpd/conf.d/wsgi.conf file with some configuration written by Amazon, and Amazon can (and will) modify it in the future outside our control.
So, when we want to add some configuration to Apache, such as a redirect, the preferred way is to include in our project a YAML file that defines a new standalone Apache config file, which is loaded in addition to the wsgi.conf file that Amazon creates automatically, like this:
files:
  "/etc/httpd/conf.d/ssl_rewrite.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      RewriteEngine On
      <If "-n '%{HTTP:X-Forwarded-Proto}' && %{HTTP:X-Forwarded-Proto} != 'https'">
        RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R,L]
      </If>
Doing that, once we deploy our code, a /etc/httpd/conf.d/ssl_rewrite.conf file will be created and used by Apache too (we are not editing the original wsgi.conf file, we are just providing more configuration in a new file).
There is an amazing explanation of this pattern here: https://stackoverflow.com/a/38751648/1062587.
These YAML config files can create Apache config files or append something to existing files, but they can’t replace or edit existing content.
However, regarding my problem, the original wsgi.conf file already provides a section like this:
<VirtualHost *:80>
  [...]
  <Directory /opt/python/current/app/>
    Require all granted
  </Directory>
  [...]
</VirtualHost>
Since I want to block all invalid host requests, I assume I need that to be changed to something like this (according to this answer: https://stackoverflow.com/a/43322857/1062587):
<VirtualHost *:80>
  [...]
  <Directory /opt/python/current/app/>
    Require expr "%{HTTP_HOST} in {'whatever.com', 'www.whatever.com'}"
    Options
  </Directory>
  [...]
</VirtualHost>
My problem is: YAML files can’t make edits. I can only provide new configuration with them.
So, I don’t know whether I can provide some configuration in a standalone Apache config file that overrides the existing <VirtualHost> -> <Directory> section, so that I can just define the required directives in a YAML file like the ones ElasticBeanstalk understands.
Otherwise, I will have to provide a script to make the replacement on the fly. I know how to do that, but I find it ugly. I am just asking if there is a more elegant solution.
2 Answers
Since I haven't found a better approach using YAML files, I am using a Python script to modify the original Apache wsgi.conf file in the most agnostic way I can think of (regardless of whether Amazon modifies it in the future).
I share it here in case anyone finds it useful. With this approach, you don't have to hardcode the host whitelist anywhere in the code.
.ebextensions/deploy.config
Notice that you can consider adding leader_only: true if you are not planning to change the DJANGO_ALLOWED_HOSTS environment variable at any time (explained later) and if you understand the implications.
.ebextensions/apache_block_invalid_hosts.py
Notice that the output path of the edited file is ../wsgi.conf and not /etc/httpd/conf.d/wsgi.conf. Trust me on this, it works.
Notice that I am using the __ separator on purpose, instead of a comma. This is because I sometimes create/clone environments directly from my command line using eb clone while providing changes in the values of environment variables. If you do that, you can't include commas inside values, and there is no way to escape them.
[...] django-environ library to read environment variables from the system):

Apache looks at the Host HTTP request header when it decides on name-based routing: https://httpd.apache.org/docs/2.4/vhosts/name-based.html
Just have a default virtual host serving a static 404 page, and route to Django only when a proper Host HTTP request header is specified.
It is also documented in Django: https://docs.djangoproject.com/en/3.1/howto/deployment/checklist/#allowed-hosts
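The default-virtual-host idea can be sketched as a small Python helper that renders the two Apache blocks from a host list. This is an illustration under assumptions, not a tested Elastic Beanstalk setup: `render_vhosts` and the `catchall.invalid` name are hypothetical, `Redirect 404 /` stands in for the static 404 page, and on Elastic Beanstalk the Django virtual host lives in Amazon's wsgi.conf, so the second block would have to be adapted to it.

```python
def render_vhosts(hosts, catch_all_name="catchall.invalid"):
    """Render a catch-all vhost followed by a name-based vhost for `hosts`.

    Apache serves requests whose Host matches no ServerName/ServerAlias
    from the first vhost defined for the address, so order matters here.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    primary, aliases = hosts[0], hosts[1:]
    lines = [
        "# Default vhost: handles any Host not matched below.",
        "<VirtualHost *:80>",
        "    ServerName " + catch_all_name,
        "    Redirect 404 /",
        "</VirtualHost>",
        "",
        "<VirtualHost *:80>",
        "    ServerName " + primary,
    ]
    if aliases:
        lines.append("    ServerAlias " + " ".join(aliases))
    lines += [
        "    # ... WSGI directives for the Django app go here ...",
        "</VirtualHost>",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_vhosts(["whatever.com", "www.whatever.com"]))
```

With this layout, requests carrying an unknown Host header never reach the Django virtual host, so Django has nothing to report.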