I am trying to block some bots using RewriteEngine in .htaccess. For DotBot
and similar bots I found many snippets like:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^DotBot
RewriteRule ^.* - [F,L]
I understand everything with one exception: why do most sites use ^DotBot instead of DotBot? I am aware that ^ anchors the match to the beginning of the string. In my logs, however, I always find user agents like:
Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, [email protected])
Does RewriteCond test the whole string? In my case, I think so. Only
RewriteCond %{HTTP_USER_AGENT} DotBot
works well. But I have no way to test it directly on my managed systems. DotBot is a very short identifier, and I do not want to block the wrong agents.
Mario
2 Answers
Using a browser extension, I was able to find an effective condition:
It works in the root directory, but files in subdirectories are still ignored. I have no idea which RewriteRule applies to all content in the current directory and all of its subdirectories.
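One possible explanation, offered as an assumption rather than a diagnosis: a root .htaccess normally covers its subdirectories too, but mod_rewrite rules are not inherited by default once a subdirectory has its own .htaccess with rewrite directives. A minimal sketch, assuming a hypothetical subdirectory that carries its own rewrite rules:

```apache
# --- /.htaccess (document root) ---
RewriteEngine On
# Unanchored pattern: matches "DotBot" anywhere in the User-Agent header.
# "^DotBot" would only match if the header started with "DotBot", which is
# not the case for "Mozilla/5.0 (compatible; DotBot/1.1; ...)".
RewriteCond %{HTTP_USER_AGENT} DotBot
# "^" matches every request path; [F] answers 403 Forbidden and implies [L].
RewriteRule ^ - [F]

# --- /subdir/.htaccess (hypothetical subdirectory with its own rules) ---
RewriteEngine On
# Without this line, the rules in this file replace the parent's rules
# instead of being combined with them.
RewriteOptions Inherit
```

If no subdirectory has its own .htaccess, the root rules alone should already apply everywhere below the root, and the cause would lie elsewhere.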
Could you please try the following? It looks for the string bot using the ignore-case option (since we can't be sure what else might appear alongside the word bot, why not look only for the string bot here). It is based only on the samples you have shown, and I couldn't test it, so let me know if this helps.
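A minimal sketch of such a condition, assuming the same forbidding rule as in the question and likewise untested:

```apache
RewriteEngine On
# [NC] makes the match case-insensitive, so "bot", "Bot" and "BOT" all match
# anywhere in the User-Agent header.
RewriteCond %{HTTP_USER_AGENT} bot [NC]
# Reject every matching request with 403 Forbidden.
RewriteRule ^ - [F]
```

Note that a pattern this broad also matches crawlers such as Googlebot and bingbot, so a narrower pattern like DotBot is safer if only specific bots should be blocked.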