I was trying to protect the main page because, in my Google console report, a URL with a query string is visible, like this example:
https://example.com/?s=something.g
I would like to return a 404 for every query string, but only on the main page "example.com/". Everything else, such as the JavaScript/CSS files, folders and wp-admin, should still be able to use query strings.
This should not be allowed (only on the main page):
https://example.com/?anything=something
https://example.com/?anythingnew=something&anotherone=something
https://example.com/index.php?anything=something
But these URLs should be allowed (everything else should be fine):
https://example.com/something.js?anything=something
https://example.com/folder/?anything=something
https://example.com/folder/anotherfolder/anyfile.php?anything=something
I tried this:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9} /([^?]*)?
RewriteRule (.*) /$1? [R=404,L]
It appears that all query strings are disallowed, including those on the files and folders inside.
I also tried this:
RewriteCond %{QUERY_STRING} .+
RewriteRule (.*) /$1? [R=404,L]
Same thing, nothing worked. The rule should apply only to the main page. Thanks in advance.
2 Answers
You were not far from the solution:
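A minimal sketch of that solution, pieced together from the explanation that follows. It assumes the rules sit in the `.htaccess` file at the document root, ahead of any WordPress rewrite block:

```apache
RewriteEngine On

# Only act when the query string is non-empty.
RewriteCond %{QUERY_STRING} ^.+$
# Match the homepage only: an empty path or "index.php".
# "-" leaves the URL untouched; R=404 makes Apache answer with 404 Not Found.
RewriteRule ^(?:index\.php)?$ - [R=404,L]
```

With a non-3xx status in `R`, Apache stops rewriting on its own, so the `L` flag is kept here only to mirror the flags used in the question.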
Explained
The RewriteRule takes the path (without the query string) as input. So, if you want to apply this rule only to the homepage (with or without `index.php`), you have to write a regular expression such as `^(?:index\.php)?$`:

`^` matches the beginning of the string, meaning "it should start with" instead of just "it should contain".

`$` matches the end of the string, meaning "it should finish with".

`(?:)` is a non-capturing group. If you write `()` instead, it is a capturing group, which generates a variable called `$1`. But we don't need to capture this part to put it back in the rewritten URL, as we can just put `-` to say "nothing to change" and generate the 404 error. Putting the question mark after this group means that it can be present or not. I've put `index\.php` inside it to say that the URL may or may not contain it. The dot has to be escaped because `.` means "any char" in a regular expression pattern.
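You might also see someone write `^/?(?:index\.php)?$` to say that the path could come with or without the leading slash. But in a `.htaccess` (per-directory) context, Apache always strips this leading slash before testing the path against the RewriteRule pattern, so there is no reason to add it there: the extra test just uses a few CPU cycles for nothing. The leading slash is only present when the rule lives in the server or virtual-host configuration.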
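For comparison, a sketch of the case where the optional leading slash does matter: when the rule is placed in the server or virtual-host configuration instead of `.htaccess`, the path handed to RewriteRule keeps its leading slash. The `ServerName` below is just the example domain from the question, and the rest of the virtual host is assumed to exist already:

```apache
<VirtualHost *:80>
    ServerName example.com

    RewriteEngine On
    # In this context the tested path is "/" or "/index.php", so the
    # pattern allows the slash; "/?" keeps it usable in .htaccess too.
    RewriteCond %{QUERY_STRING} ^.+$
    RewriteRule ^/?(?:index\.php)?$ - [R=404,L]
</VirtualHost>
```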
The RewriteCond is only evaluated once the RewriteRule pattern has matched. Here, we want to test whether the query string is empty or not. This can easily be done by matching any char one or several times with `.+`. It would work with or without the `^` and `$` around it. I prefer putting them to show that the full query string must not be empty.
The below Rewrite configuration passed all of your provided test cases.
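A minimal sketch of such a configuration, reconstructed from the explanation that follows. The two RewriteCond lines are quoted in that explanation, while the exact RewriteRule pattern (`!/`) and the `.htaccess`-at-document-root placement are assumptions:

```apache
RewriteEngine On

# Condition 1: the query string must not be empty.
RewriteCond %{QUERY_STRING} .+
# Condition 2: skip root-level JavaScript and CSS files.
RewriteCond %{REQUEST_URI} !^/\w+\.(js|css)$
# Assumed pattern: "!/" matches only paths that contain no slash.
# In a root .htaccess the tested path has no leading slash, e.g.
#   /?anything=something     -> ""           (no slash: rule applies)
#   /index.php?anything=x    -> "index.php"  (no slash: rule applies)
#   /folder/?anything=x      -> "folder/"    (has slash: rule skipped)
RewriteRule !/ - [R=404,L]
```

Under this sketch, any other root-level file requested with a query string (a `.php` script, for instance) would also receive the 404, so it may need further exclusions.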
First, the RewriteRule ensures that it is applied only to requests for your homepage. In these examples, the request and the resulting string passed to the RewriteRule are:
As we prefixed the regex in the RewriteRule with `!`, it will negate the result, meaning only strings that do not contain a `/` will continue to the RewriteCond checks.

Next, all RewriteCond lines need to evaluate as true for the RewriteRule to be applied. Here we have two:

`RewriteCond %{QUERY_STRING} .+` checks that the query string is not empty by matching one or more of any character.

`RewriteCond %{REQUEST_URI} !^/\w+\.(js|css)$` checks that the URI is not requesting a file with a JavaScript or CSS extension. This is another negated condition: the pattern itself matches a URI made of a slash, one or more word characters, a literal `.`, and either `css` or `js`, and the `!` makes the condition true only when that pattern does not match.

There is an implied AND between these RewriteCond lines.

As a bonus, if required, you can enable additional logging to troubleshoot the rewrite module by adding the line below to your Apache conf file.
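On Apache 2.4 this is normally done with the `LogLevel` directive and a per-module trace level. The sketch below assumes 2.4 and picks `trace3` as an arbitrary verbosity (older 2.2 servers used `RewriteLog` and `RewriteLogLevel` instead):

```apache
# Keep the global level at "alert", but let mod_rewrite log at trace3.
# Goes in the server or virtual-host configuration, not in .htaccess.
LogLevel alert rewrite:trace3
```

The trace lines are written to the regular error log.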
Some examples of the output you’ll see taken from my testing:
Request: `/index.html?test=true`, Logs:
Request: `/test.css?test=true`, Logs: