One of our production Magento 2.4.5 websites was going down frequently. After investigating in detail, we found that we are receiving excessive crawler requests from meta-externalagent. See the sample log entry below.
57.141.0.9 - - [xx/Dec/2024:12:xx:09 +0530] "GET /our-xxxxxxxxxs/ms-xxx?karat=23&size=6%2C7%2C13%2C16%2C17%2C18%2C24 HTTP/1.1" 200 780515 "-" "meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler
Checking the logs further, I could see that we received "64941" requests from "meta-externalagent/1.1" in 12 hours, roughly 1.5 requests per second on average.
I can see that lots of people are facing similar issues, but no clear solution is mentioned for Magento version 2.4.5.
excessive traffic from facebookexternalhit bot
https://developers.facebook.com/community/threads/992798532416685/
Is there any way to rate-limit the Meta crawler? Since we run Facebook ads, we cannot completely block requests from meta-externalagent.
Currently I have blocked meta-externalagent using the 7G Firewall in nginx.
2 Answers
I have gone through lots of discussion topics and also read the link below. I tried a combination of the solution posted by Ivan Shatsky and the one linked below, and it is working. I still have to monitor the performance for a few days.
https://github.com/kbourdakos/facebook-UA-facebookexternalhit-1.1---RateLimit-Using-nginx/blob/main/configuration
You can rate-limit this bot using nginx's built-in rate-limiting functionality. According to the "Meta Web Crawlers" Facebook article, the User-Agent HTTP header for this bot can be either of two values listed there. However, since the article may not list all possible User-Agent values, you should check your nginx access logs to be sure.

To limit requests from this bot, you can add the following snippet to your nginx configuration:
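A minimal sketch of such a configuration, not a drop-in file. It assumes the two User-Agent values are the full string seen in the access log entry above (with its closing parenthesis) and the short form "meta-externalagent/1.1"; verify both against your own logs. The variable name $meta_crawler_key, the zone name meta_crawler, the zone size, the rate, and the burst value are all illustrative choices.

```nginx
# Goes in the http {} context.
# Map the User-Agent to a rate-limit key; an empty key means "do not limit".
map $http_user_agent $meta_crawler_key {
    default "";
    # Assumed values: check your access logs before relying on them.
    "meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)" 1;
    "meta-externalagent/1.1" 1;
}

# One shared bucket for all matching requests; 1r/s is an illustrative rate.
limit_req_zone $meta_crawler_key zone=meta_crawler:1m rate=1r/s;

server {
    # ... existing Magento server configuration ...

    # Allow short bursts and reject the rest with 429 instead of the default 503.
    limit_req zone=meta_crawler burst=10 nodelay;
    limit_req_status 429;
}
```

Because every matching request maps to the same key, the crawler is limited as a whole regardless of which IP address it connects from, while requests with any other User-Agent get an empty key and are not counted.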
It is recommended to include the Retry-After HTTP header in the HTTP 429 response.

Although you can use regular expressions in the map block, I do not recommend doing it due to performance considerations (read this nginx support forum post for the details).
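As a rough sketch of both points above, the Retry-After header can be attached through a named location that handles the 429 status, and a regular-expression map entry is written with a leading ~. The location name @rate_limited, the 60-second value, and the variable name $meta_crawler_key_re are hypothetical; the regex is shown only for comparison with the exact-match approach.

```nginx
server {
    # ... existing configuration, including the limit_req directives ...

    # Route rejected requests to a named location that adds the header.
    error_page 429 @rate_limited;

    location @rate_limited {
        # "always" is needed because add_header skips 429 responses by default.
        add_header Retry-After 60 always;
        return 429;
    }
}

# Regex alternative to exact-match map entries (http {} context).
map $http_user_agent $meta_crawler_key_re {
    default "";
    # A leading ~ marks a case-sensitive regular expression.
    ~^meta-externalagent/ 1;
}
```

The regex form catches future version bumps of the User-Agent string, at the cost of evaluating a pattern for every request that reaches the map.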