I have a problem with AspiegelBot crawling one of the sites on a server, which uses up a lot of CPU cores. I've been trying to block the bot in both the site's .htaccess and robots.txt, with no success. The bot still constantly appears in my access.log:
114.119.165.232 - - [20/Apr/2020:07:38:40 +0200] "GET /tillbehor.html?size=98%2C422%2C423%2C1129%2C1378 HTTP/1.1" 301 296 "-" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; AspiegelBot)"
Here is some of what I’ve tried:
htaccess
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^.*(Mb2345Browser|AspiegelBot|LieBaoFast|MicroMessenger|zh-CN|Kinza|Mb2345Browser).*$ [NC]
RewriteRule .* - [F,L]
robots.txt
User-agent: *
Allow: /
Disallow: */shopby
#######################################
################ PAGES ################
#######################################
Disallow: /privacy-policy-cookie-restriction-mode/
Disallow: /terms/
#######################################
############# Block Bots ##############
#######################################
User-agent: MJ12bot
Disallow: /
User-agent: SemrushBot
Disallow: /
User-agent: SemrushBot-SA
Disallow: /
User-agent: rogerbot
Disallow:/
User-agent: dotbot
Disallow:/
User-agent: AhrefsBot
Disallow: /
User-agent: Alexibot
Disallow: /
User-agent: SurveyBot
Disallow: /
User-agent: Xenu's
Disallow: /
User-agent: Xenu's Link Sleuth 1.1c
Disallow: /
User-agent: AspiegelBot
Disallow: /
Am I missing something or writing something incorrectly? I'm kinda at a loss here.
2 Answers
I posted this as a comment, but since it's what solved this for me, I will add it as an answer. I managed to block the bot by denying the leading part of its IP address (the whole range it crawls from) in the .htaccess file. It might not be the optimal way to do it, but it worked.
Deny from 114.119.0.0/16
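Deny from is the Apache 2.2 style syntax, which Apache 2.4 only supports through mod_access_compat. If the server runs Apache 2.4, something like the following may be closer to the current way of writing it. This is only a rough sketch: it assumes mod_authz_core and mod_setenvif are loaded and that the .htaccess file is allowed to override these settings (AllowOverride AuthConfig FileInfo). The variable name bad_bot is made up for illustration and does not come from the original setup.

```apache
# Flag any request whose User-Agent contains "AspiegelBot" (case-insensitive).
# "bad_bot" is an arbitrary, illustrative variable name.
SetEnvIfNoCase User-Agent "AspiegelBot" bad_bot

<RequireAll>
    # Allow everyone by default...
    Require all granted
    # ...except the address range used in the Deny line above...
    Require not ip 114.119.0.0/16
    # ...and any request flagged by the User-Agent check.
    Require not env bad_bot
</RequireAll>
```

Blocking on both the IP range and the User-Agent means the bot should still be refused if it starts crawling from an address outside 114.119.0.0/16, as long as it keeps identifying itself as AspiegelBot.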
Try adding this to your .htaccess