I’m trying to detect either of the following two cases:
- A specific list of bots (FacebookExternalHit|LinkedInBot|TwitterBot|Baiduspider)
- Any bots that don’t support the Crawlable Ajax Specification
I’ve seen similar questions (How to recognize Facebook User-Agent) but nothing that explains how to do this in Node and Express.
I need to do this in a format like this:
app.get("*", function (req, res) {
  if (is one of the bots) // serve snapshot
  if (is not one of the bots) res.sendFile(__dirname + "/public/index.html");
});
3 Answers
What you can do is use the request.headers object to check whether the incoming request contains any User-Agent (UA) information specific to that bot. A simple example:
Node
Express
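A minimal sketch of this check, assuming the bot list from the question and placeholder responses. The names isListedBot and nodeListener are hypothetical, not part of Node or Express:

```javascript
// Pattern built from the bot list in the question; the "i" flag makes it case-insensitive.
const BOT_PATTERN = /FacebookExternalHit|LinkedInBot|TwitterBot|Baiduspider/i;

// Returns true when the User-Agent header matches one of the listed bots.
// Node lowercases incoming header names, so the key is "user-agent".
function isListedBot(headers) {
  const userAgent = headers["user-agent"] || "";
  return BOT_PATTERN.test(userAgent);
}

// Plain Node: a request listener that could be passed to http.createServer.
function nodeListener(req, res) {
  if (isListedBot(req.headers)) {
    res.end("snapshot"); // placeholder: serve the pre-rendered snapshot here
  } else {
    res.end("app"); // placeholder: serve the regular page here
  }
}

// Express: the same check inside the route from the question
// (assumes an existing Express `app`).
// app.get("*", function (req, res) {
//   if (isListedBot(req.headers)) { /* serve snapshot */ }
//   else res.sendFile(__dirname + "/public/index.html");
// });
```

Because req.headers is available in both plain Node and Express, the same helper works in either setup.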
You can check the User-Agent header in the request object and test its value for different bots. As of now, Facebook says it has three types of User-Agent header values (see The Facebook Crawler), and Twitter's User-Agent includes a version (see Twitter URL Crawling & Caching). The example below should cover both bots.
Node
Express
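A sketch of a version-agnostic check for the Facebook and Twitter crawlers, assuming their User-Agent values look like facebookexternalhit/1.1, Facebot and Twitterbot/1.0 (worth verifying against the crawler documentation mentioned above, since these may change). The names isSocialBot and flagSocialBot are hypothetical:

```javascript
// Assumed User-Agent shapes: "facebookexternalhit/<version> (...)", "Facebot",
// and "Twitterbot/<version>". The version is matched loosely so that a
// version bump does not break detection.
const SOCIAL_BOT_PATTERN = /facebookexternalhit\/[\d.]+|Facebot|Twitterbot\/[\d.]+/i;

// Returns true when the given User-Agent string looks like one of the two crawlers.
function isSocialBot(userAgent) {
  return SOCIAL_BOT_PATTERN.test(userAgent || "");
}

// Express-style middleware: records the result on the request and continues.
function flagSocialBot(req, res, next) {
  req.isSocialBot = isSocialBot(req.headers["user-agent"]);
  next();
}

// Usage with Express (assumes an existing `app`):
// app.use(flagSocialBot);
// app.get("*", function (req, res) {
//   if (req.isSocialBot) { /* serve snapshot */ }
//   else res.sendFile(__dirname + "/public/index.html");
// });
```

In plain Node, isSocialBot(req.headers["user-agent"]) can be called directly inside the request listener.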
This Node/Express middleware analyzes a wide range of user agent strings and gives you a simple “bot==true” or “desktop==true” style result. I haven’t used it, and the readme sounds like it was just a trial project, so I don’t know how well maintained it will be going forward, but it should detect all sorts of bots.
https://github.com/rguerreiro/express-device