I have a rule which boils down to:
RewriteCond %{REQUEST_URI} ^(.+).html$
RewriteRule ^(.+).html$ $1 [R=302,L]
It won’t work without the first line, even though the second line contains exactly the same regex. As I understand it, if there’s no “.html” at the end, RewriteRule won’t rewrite anything, so why doesn’t it work without that RewriteCond? Trying to access example.com/test/abcd.html gives an error in the server log:
[REWRITE] detected external loop redirection with target URL: /test/abcd, skip.
Here is the whole .htaccess file:
RewriteEngine On
# HTTPS everywhere and strip WWW
RewriteCond %{HTTPS} !=on
RewriteCond %{HTTP_HOST} ^www.(.+) [NC]
RewriteRule ^ https://%1%{REQUEST_URI} [L,R=301]
# if example.com/xxx is not directory AND example.com/xxx.html file exists
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
# rewrite example.com/xxx to example.com/xxx.html
# only if there's no slash at the end
RewriteRule ^(.*[^/])$ $1.html
# if example.com/xxx/ is not directory, rewrite to example.com/xxx
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ $1 [R=301,L]
# if xxx.html is not directory AND xxx.html file exists
# redirect from xxx.html to xxx
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} -f
# won't work without line below, even though both have ^(.+).html$ - can't understand why
RewriteCond %{REQUEST_URI} ^(.+).html$
RewriteRule ^(.+).html$ $1 [R=301,L]
2 Answers
EDIT: I was wrong. I wasn't even aware that my website is hosted on LiteSpeed Web Server (LSWS), which is largely, but not 100%, compatible with Apache. So the following reasoning applies to LSWS, but not to Apache.
I've finally understood why it didn't work.
In the original version the file structure was:
1. Redirect to HTTPS and strip WWW (and stop, because of [L])
2. /foo/bar.html/ to /foo/bar.html (and stop as above)
3. /foo/bar to /foo/bar.html (internally)
4. /foo/bar.html to /foo/bar (and stop as in 1. and 2.)

So, when /foo/bar.html was requested, it was matched by rule 4 and redirected to /foo/bar. Then rewriting started again, as a new request was made to /foo/bar, and it was rewritten to /foo/bar.html (rule 3). Then it went on to the next rule, rule 4 (again), and was redirected back to /foo/bar, so yet another request was made and rewriting started again, but this time the server blocked it because it loops.

There are two ways to fix that. The first way is to change the order of the last two operations:
1. Redirect to HTTPS and strip WWW (and stop, because of [L])
2. /foo/bar.html/ to /foo/bar.html (and stop)
3. /foo/bar.html to /foo/bar (and stop)
4. /foo/bar to /foo/bar.html (internally)

In this scenario, a request for /foo/bar.html will be redirected to /foo/bar (rule 3) as before, and in the new request it will be rewritten to /foo/bar.html internally (rule 4), and that's all. It won't be redirected back to /foo/bar because there are no redirections or other rules after rule 4.

The second way is to add the [L] flag to the rule rewriting /foo/bar to /foo/bar.html, which has the same effect as changing the order. The rewriting will go like this:
1. Redirect to HTTPS and strip WWW (and stop)
2. /foo/bar.html/ to /foo/bar.html (and stop)
3. /foo/bar to /foo/bar.html (internally) (and stop)
4. /foo/bar.html to /foo/bar (and stop)

I'll go with the first way (reordering), as it will allow me to add other rules after the "/foo/bar to /foo/bar.html" rule.
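As a rough sketch of the first option (reordering), assembled from the rules in the question rather than copied from the final file, the last three rules might be arranged as below. The leading slash in the redirect targets and the escaped dot in \.html are my own assumptions (the former presumes the .htaccess file sits in the document root). Note that on Apache proper, per-directory rules are run again after an internal rewrite, so this ordering alone may still loop there; it reflects the LSWS behaviour described above.

```apache
RewriteEngine On

# 1. HTTPS / WWW redirect from the question goes here, unchanged.

# 2. /foo/bar.html/ -> /foo/bar.html: external redirect, then stop
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [R=301,L]

# 3. /foo/bar.html -> /foo/bar: external redirect, then stop
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.+)\.html$ /$1 [R=301,L]

# 4. /foo/bar -> /foo/bar.html: internal rewrite, kept last so that
#    no redirect rule runs after it within the same pass
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.*[^/])$ $1.html
```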
The final (for now...) .htaccess file:

Your rules generate an infinite redirect loop. Indeed, something like foo/bar.html goes to foo/bar, which goes to foo/bar.html internally, which goes back to foo/bar, and so on.

The following rules will prevent such a redirect loop (a few improvements included):
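One common way to break this kind of loop, sketched here as an illustration rather than as the rules this answer originally listed, is to test %{THE_REQUEST} instead of the current URI. THE_REQUEST holds the request line the client actually sent and is not altered by internal rewrites, so the redirect fires only when the visitor typed the .html URL. The pattern and the leading-slash targets below are my assumptions, and I am assuming the server handles THE_REQUEST the way Apache's mod_rewrite does.

```apache
RewriteEngine On

# Redirect foo/bar.html -> foo/bar only if the client itself requested
# the .html URL, e.g. "GET /foo/bar.html HTTP/1.1".
# An internal rewrite to foo/bar.html leaves THE_REQUEST untouched,
# so this rule does not fire a second time.
RewriteCond %{THE_REQUEST} \s/+([^?\s]+)\.html[\s?]
RewriteRule ^ /%1 [R=301,L]

# Internally serve foo/bar.html for foo/bar when that file exists
# and foo/bar is not a directory.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.*[^/])$ $1.html [L]
```

With rules along these lines, a request for foo/bar.html gets a single 301 to foo/bar, and the follow-up request for foo/bar is mapped to the file internally without triggering another redirect.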