I am trying to block some bots using RewriteEngine in .htaccess. For DotBot and similar bots I found many snippets like:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^DotBot 
RewriteRule ^.* - [F,L]

I understand everything with one exception: why do most sites use ^DotBot instead of DotBot? I'm aware that ^ anchors the match to the beginning of the string. In my logs, however, the user agents always look like this:

Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, [email protected])

Does RewriteCond test the whole string? In my case I think so, because only

RewriteCond %{HTTP_USER_AGENT} DotBot

works. However, I have no way to test this directly on my managed systems. DotBot is a very short identifier, and I want to avoid blocking the wrong agents.

Mario
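
On the anchoring question: a RewriteCond pattern is a regular expression that can match anywhere in the test string unless it is anchored, so ^DotBot only matches a User-Agent header that begins with DotBot. The logged header begins with Mozilla/5.0, which would explain why only the unanchored form matches. Below is a minimal sketch. It assumes the bot always sends its name followed by a slash and a version, as in the logged DotBot/1.1, and uses that assumption to narrow the short identifier a little.

```apache
RewriteEngine On
# Unanchored pattern: matches "DotBot/" anywhere in the User-Agent header.
# Assumption: the name is always followed by "/" and a version,
# as in the logged "DotBot/1.1".
RewriteCond %{HTTP_USER_AGENT} DotBot/
# "^" matches every request path; "-" leaves the URL unchanged;
# [F] answers 403 Forbidden and implies [L].
RewriteRule ^ - [F]
```

A request carrying the logged User-Agent header should then receive a 403 response, while ordinary browser requests should be unaffected.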

2 Answers


  1. Chosen as BEST ANSWER

    With the help of a browser extension I found a condition that works:

    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^.*(DotBot|YandexBot|AhrefsBot|MegaIndex|SEOkicks|MJ12bot).*$
    RewriteRule ^ - [F,L]
    

    It works in the root directory, but files in subdirectories are still not blocked. I don't know which RewriteRule would apply to all content in the current directory and all of its subdirectories.


  2. Could you please try the following? It looks for the string bot with the case-insensitive option ([NC]). Since we can't be sure what else accompanies the word bot, why not match only the string bot? This is based only on the samples you showed, and I couldn't test it, so let me know if it helps.

    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} bot [NC] 
    RewriteRule ^ - [F,L]
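
Two points from the answers above can be sketched together. First, a bare bot with [NC] also matches crawlers such as Googlebot and bingbot, so an explicit list may be safer if search engines should stay allowed. Second, rules in a parent .htaccess are not applied in a subdirectory whose own .htaccess contains mod_rewrite directives, unless inheritance is enabled. Whether that is the cause of the subdirectory problem described in the first answer is an assumption. The bot names are taken from the first answer, and the [NC] flag is added here only as a precaution.

```apache
# --- .htaccess in the document root ---
RewriteEngine On
# Apache 2.4.8 or later: pass these conditions and rules down to
# subdirectories that carry their own rewrite configuration.
RewriteOptions InheritDown
# Explicit names instead of a bare "bot". No leading ^.* or trailing .*$
# is needed, because the pattern is unanchored.
RewriteCond %{HTTP_USER_AGENT} (DotBot|YandexBot|AhrefsBot|MegaIndex|SEOkicks|MJ12bot) [NC]
RewriteRule ^ - [F]

# --- .htaccess in a subdirectory with its own rewrite rules ---
# Alternative for servers without InheritDown: pull in the parent's
# conditions and rules from the child side instead.
RewriteEngine On
RewriteOptions Inherit
```

Only one of the two inheritance options is needed. If the subdirectories have no rewrite directives of their own, the root rules should already apply to them and the cause lies elsewhere.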
    