An .htaccess file contains a set of rules that reject some ill-formed URLs, e.g.:

RewriteCond %{QUERY_STRING} (select|/\*\*/) [NC]
RewriteRule ^ - [F,L]

How can I get a log of all rejected URLs?

Or how can I best log these rejected URLs, efficiently or temporarily?

[EDIT with more context :] My site sometimes goes down because of excessive attempts by hacker bots to find a way in. To avoid that, I have set up some rules in the .htaccess that reject the most common patterns found in hacker-bot URLs. This works fine, or at least it appears to. I now wish to check from time to time whether

  • some rules are useless and I could remove them
  • some rules are too broad and reject legitimate requests

To do so, I could build a script that applies the exact same rules (taken from the .htaccess) to the Apache access logs, which contain all requests. But that would require syncing the script every time I update the .htaccess. Hence, I would like to know whether there is a setting or a "good" way to log all, and only, the URLs rejected by the .htaccess.

2 Answers


  1. Chosen as BEST ANSWER

    As @arkascha mentioned, Apache records the response status of each request in its access.log (a request rejected by the [F] flag gets a 403 status), so the best approach is to get the rejected URLs from there.
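
As a rough illustration of reading the rejections back out of the access log, here is a minimal Python sketch. It assumes the common/combined log format and that a 403 status identifies a rejected request; any other source of 403 responses on the server would be listed too. The function name `rejected_targets` and the sample lines are made up for this example.

```python
import re

# Assumption: Apache common/combined log format, i.e.
# host ident user [time] "METHOD target PROTOCOL" status size ...
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+)[^"]*" (?P<status>\d{3}) '
)


def rejected_targets(lines, status="403"):
    """Yield the request target of every log line with the given status."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == status:
            yield match.group("target")


# Hypothetical sample lines; in practice, iterate over the opened log file.
sample = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.php?id=1+select+1 HTTP/1.1" 403 199 "-" "bot"',
    '198.51.100.4 - - [10/Oct/2023:13:55:40 +0000] '
    '"GET /index.php?id=1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for target in rejected_targets(sample):
    print(target)
```

Because the script only looks at the status code, it does not need to be kept in sync with the rules in the .htaccess.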


  2. I begin to understand now, with the additional comment you made above. What you are asking is actually not clear from what you wrote in your question. You wrote "a log of all rejected urls", which I understood to mean URLs that were requested and rejected, because that is what an HTTP server deals with. But now I understand that you are actually not interested in URLs at all, but in a list of all possible query strings matching that condition. So we are talking about theoretical computer science here: formal languages, a part of complexity theory.

    What you ask is not possible. The reason is that the list you ask for is, obviously, infinitely large. So all you could do is set up an algorithm that generates one matching string after another according to a specific rule set. But I dare say that this won't really help; the actual rule set is probably more interesting for you.

    I would phrase it this way: your regular expression will match any string that contains either of the two substrings "select" or "/**/" anywhere, so at the beginning, in the middle or at the end, regardless of what comes before and after it. Take a look at this: https://regex101.com/r/tHkqZE/1 In there, "foo" and "bar" can be anything.
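
To make the "matches anywhere" behaviour concrete, here is a small Python sketch. It assumes the "/**/" alternative is meant literally (so the asterisks are escaped) and uses `re.IGNORECASE` to stand in for the `[NC]` flag; the sample query strings are invented for illustration.

```python
import re

# The condition from the question, with "/**/" matched literally.
# re.IGNORECASE plays the role of Apache's [NC] flag.
pattern = re.compile(r"(select|/\*\*/)", re.IGNORECASE)

# Hypothetical query strings.
samples = [
    "select",
    "id=1+SELECT+1",
    "fooselectbar",
    "q=a/**/b",
    "page=selection",
    "id=42",
]

# search() looks for the pattern anywhere in the string,
# which is how an unanchored RewriteCond pattern behaves.
matches = {s: bool(pattern.search(s)) for s in samples}

for query, hit in matches.items():
    print(f"{query!r}: {'rejected' if hit else 'allowed'}")
```

Note that "page=selection" is rejected as well, which is the kind of overly broad match the question is worried about.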

    Maybe you want to limit that set. A first step, and probably a sensible one, would be to anchor the expression at the beginning or end of the full string or at the "&" character, considering the typical construction of a query string.
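
One possible reading of that anchoring idea, sketched in Python: the token must be the complete value of a parameter, bounded by the start of the string or "&" on the left and by "&" or the end of the string on the right. This particular pattern is an assumption made for illustration, not a rule taken from the question, and the sample query strings are invented.

```python
import re

# Assumed anchored variant: a parameter whose whole value is the token.
# (?:^|&)   start of the query string or a parameter separator
# [^=&]*=   the parameter name and the "=" sign
# (?:&|$)   the next separator or the end of the query string
anchored = re.compile(r"(?:^|&)[^=&]*=(?:select|/\*\*/)(?:&|$)", re.IGNORECASE)

# Hypothetical query strings.
samples = [
    "q=select",
    "a=1&q=SELECT&b=2",
    "q=/**/",
    "page=selection",
    "id=1+select+1",
]

anchored_matches = {s: bool(anchored.search(s)) for s in samples}

for query, hit in anchored_matches.items():
    print(f"{query!r}: {'rejected' if hit else 'allowed'}")
```

With this variant "page=selection" is no longer rejected, but neither is "id=1+select+1", so how tightly to anchor is a trade-off between false positives and missed attack patterns.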
