I am hosting a static site (purely HTML/CSS) on AWS S3 with a CloudFront distribution. I have no problem configuring CloudFront alone to redirect HTTP to HTTPS, nor do I have a problem having S3 alone redirect www to the non-www (naked) domain.
The problem comes when I try to redirect all HTTP traffic to HTTPS and simultaneously redirect all www subdomains to non-www.
It simply doesn’t work. I haven’t been able to find a solution to this problem, and I’ve been looking for months. It may seem like StackOverflow has the answer, but I’m telling you it doesn’t. Either the solution reaches a dead end, or it was written for an older AWS user interface that doesn’t quite match the way it is today.
The best I have been able to come up with is an HTML redirect for www to non-www, but that’s not ideal from an SEO and maintainability standpoint.
What is the best solution for this configuration?
2 Answers
As I mentioned in Supporting HTTPS URL redirection with a single CloudFront distribution, the simple and straightforward solution involves two buckets and two CloudFront distributions — one for www and one for the bare domain. I am highly skeptical that this would have any negative SEO impact.
However, that answer pre-dates the introduction of the CloudFront Lambda@Edge extension, which offers another solution because it allows you to trigger a JavaScript Lambda function to run at specific points during CloudFront’s request processing, to inspect the request and potentially modify it or otherwise react to it.
There are several examples in the documentation but they are all very minimalistic, so here’s a complete, working example, with more comments than actual code, explaining exactly what it does and how it does it.
This function, configured as an Origin Request trigger, will fire every time there is a cache miss, and inspect the Host header sent by the browser, to see if the request should be allowed through, or if it should be redirected without actually sending the request all the way through to S3. For cache hits, the function will not fire, because CloudFront already has the content cached.
Any other domain name associated with the CloudFront distribution will be redirected to the “real” domain name of your site, as configured in the function body. Optionally, it will also return a generated 404 response if someone accesses your distribution’s *.cloudfront.net default hostname directly.
You may be wondering how the cache of a single CloudFront distribution can differentiate between the content for example.com/some-path and www.example.com/some-path and cache them separately, but the answer is that it can and it does if you configure it appropriately for this setup, which means telling it to cache based on selected request headers, specifically the Host header.
Normally, enabling that configuration wouldn’t be quite compatible with S3, but it works here because the Lambda function also sets the Host header back to what S3 expects. Note that you need to configure the Origin Domain Name (the web site hosting endpoint of your bucket) inline, in the code.
With this configuration, you only need one bucket, and the bucket’s name does not need to match any of the domain names. You can use whatever bucket you want… but you do need to use the web site hosting endpoint for the bucket, so that CloudFront treats it as a custom origin. Creating an “S3 Origin” using the REST endpoint for the bucket will not work.
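What follows is a minimal sketch of the kind of Origin Request function described above, not a drop-in implementation. The hostname, the bucket endpoint, the 404 toggle, and the helper name routeRequest are all placeholders I have assumed for illustration, and it assumes the Host header is whitelisted so that the function sees the hostname the browser actually sent.

```javascript
"use strict";

// Placeholder values (assumptions): substitute your own.
const SITE_HOSTNAME = "example.com"; // the one "real" hostname of the site
const ORIGIN_HOSTNAME = "example-bucket.s3-website.us-east-2.amazonaws.com"; // bucket web site hosting endpoint
const REJECT_DEFAULT_HOSTNAME = true; // answer 404 on the *.cloudfront.net hostname

// Hypothetical helper: decides what to do with one CloudFront request object.
// Returns either the (modified) request, which CloudFront forwards to S3,
// or a generated response, which CloudFront returns without contacting S3.
function routeRequest(request) {
  // With the Host header whitelisted, this is the hostname the browser sent.
  const host = request.headers.host[0].value.toLowerCase();

  if (host === SITE_HOSTNAME) {
    // Allowed through: set Host back to what the S3 web site endpoint expects.
    request.headers.host = [{ key: "Host", value: ORIGIN_HOSTNAME }];
    return request;
  }

  if (REJECT_DEFAULT_HOSTNAME && host.endsWith(".cloudfront.net")) {
    // Someone is using the distribution's default hostname directly.
    return { status: "404", statusDescription: "Not Found", body: "Not Found" };
  }

  // Any other associated hostname (e.g. www) is redirected to the real site,
  // preserving the path and query string.
  const query = request.querystring ? "?" + request.querystring : "";
  return {
    status: "301",
    statusDescription: "Moved Permanently",
    headers: {
      location: [
        { key: "Location", value: "https://" + SITE_HOSTNAME + request.uri + query },
      ],
    },
  };
}

// Lambda@Edge entry point for the Origin Request trigger.
exports.handler = async (event) => routeRequest(event.Records[0].cf.request);
```

Keeping the decision in a plain function makes it easy to exercise locally with hand-built request objects before attaching it to the distribution.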
Finishing up the other answer here using Lambda@Edge, I realized there is a significantly simpler solution, using only a single CloudFront distribution and three (explained below) S3 buckets.
There are more constraints to this solution, but it has fewer moving parts and costs less to implement and use.
Here are the constraints:
- You need a bucket named example.com and a bucket named www.example.com.
- Both of those buckets must be in the same region.
- You need a third bucket named after your CloudFront distribution’s assigned default domain name, e.g. dzczcexample.cloudfront.net, and this bucket also must be in the same region as the other two.
Configure the CloudFront distribution’s Origin Domain Name to point to your main content bucket using its web site hosting endpoint, e.g. example.com.s3-website.us-east-2.amazonaws.com.
Configure the Alternate Domain Name settings for both example.com and www.example.com.
Whitelist the Host header for forwarding to the origin. This setting takes advantage of the fact that when S3 does not recognize the incoming HTTP Host header as being one that belongs to S3, then…
Ummm… perfect! That’s exactly what we need, and it gives us a way to pass requests to multiple buckets in one S3 region, through a single CloudFront distribution, based on what the browser asks for… because with this setup, we’re able to split the logic:
- The Origin Domain Name configured in CloudFront determines which S3 region the request is delivered to.
- The Host header is used when the request arrives at S3 for selecting which bucket handles the request.
(This is why all the buckets have to be in the same region, as mentioned above. Otherwise, the request will be delivered to the region of the “main” bucket, and that region will reject it as misrouted if the identified bucket is in a different region.)
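To make the three distribution settings concrete, here is a rough sketch of the relevant part of a distribution configuration, written as a JavaScript object in the shape the CloudFront API uses. It is only a fragment, not a complete DistributionConfig. The origin Id is a made-up label, and the ForwardedValues block is the older way of whitelisting headers, so a newer distribution would likely express the same thing with a cache policy instead.

```javascript
"use strict";

// Web site hosting endpoint of the main content bucket (example value from above).
const MAIN_BUCKET_ENDPOINT = "example.com.s3-website.us-east-2.amazonaws.com";

// Partial distribution configuration; fields not relevant here are omitted.
const distributionConfigFragment = {
  // Alternate Domain Names: both hostnames point at the same distribution.
  Aliases: { Quantity: 2, Items: ["example.com", "www.example.com"] },

  Origins: {
    Quantity: 1,
    Items: [
      {
        Id: "s3-website", // hypothetical label, any unique string works
        DomainName: MAIN_BUCKET_ENDPOINT,
        // A web site endpoint is a custom origin, and it only speaks HTTP.
        CustomOriginConfig: {
          HTTPPort: 80,
          HTTPSPort: 443,
          OriginProtocolPolicy: "http-only",
        },
      },
    ],
  },

  DefaultCacheBehavior: {
    TargetOriginId: "s3-website",
    // HTTP to HTTPS is handled by CloudFront itself.
    ViewerProtocolPolicy: "redirect-to-https",
    // Forward (and cache on) the Host header so S3 can pick the bucket.
    ForwardedValues: {
      QueryString: false,
      Cookies: { Forward: "none" },
      Headers: { Quantity: 1, Items: ["Host"] },
    },
  },
};

console.log(JSON.stringify(distributionConfigFragment, null, 2));
```

The same three choices appear in the console as the Origin Domain Name, the Alternate Domain Names, and the header whitelist on the cache behavior.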
With this configuration in place, you’ll find that example.com requests are handled by the example.com bucket, and www.example.com requests are handled by the www.example.com bucket, which means all you need to do now is configure the buckets as desired.
But there is one more critical step. You absolutely need to create a bucket named after your CloudFront distribution’s assigned default domain name (e.g. d111jozxyqk.cloudfront.net), in order to avoid setting up an exploitable scenario. It’s not a security vulnerability, it’s a billing one. It doesn’t make a great deal of difference how you configure this bucket, but it is important that you own the bucket so that nobody else can create it. Why? Because with this configuration, requests sent directly to your CloudFront distribution’s default domain name (not your custom domains) will result in S3 returning a No Such Bucket error for that bucket name. If someone else were to discover your setup, they could create that bucket, and you’d pay for all their data traffic through your CloudFront distribution. Create the bucket and either leave it empty (so that an error is returned) or set it up to redirect to your main web site.