
I’ve been searching for npm packages, but they all seem unmaintained and rely on outdated user-agent databases. Is there a reliable, up-to-date package out there that helps me detect crawlers (mostly from Google, Facebook,… for SEO)? Or, if there is no such package, can I write it myself (probably based on an up-to-date user-agent database)?

To be clearer: I’m building an isomorphic/universal React website, and I want it to be indexed by search engines and to have its title/meta data fetched by Facebook. However, I don’t want to pre-render all normal requests, so that the server isn’t overloaded. The solution I’m thinking of is therefore to pre-render only for requests coming from crawlers.
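
To illustrate the do-it-yourself route: the check itself is easy, it’s the pattern list that goes stale. A minimal sketch (the regex below is hand-picked and incomplete; keeping it current is exactly the maintenance burden I’d rather hand off to a package):

    // Hand-rolled crawler check; the pattern list covers only a few
    // well-known bots and would need constant upkeep.
    const CRAWLER_UA = /googlebot|bingbot|yandex|baiduspider|facebookexternalhit|twitterbot|slurp/i;

    function isCrawler(userAgent) {
      return CRAWLER_UA.test(userAgent || '');
    }

    console.log(isCrawler('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)')); // true
    console.log(isCrawler('Mozilla/5.0 (Windows NT 10.0; Win64; x64)'));                                // false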

3 Answers


  1. I have nothing to add to your search for npm packages. But as for an up-to-date user-agent database on which to build your own package, I would recommend ua.theafh.net

    At the moment it has data up to Nov 2014, and with more than 5.4 million agents it is, as far as I know, also the largest search engine for user agents.

  2. The best solution I’ve found is the useragent library, which allows you to do this:

    var useragent = require('useragent');
    // for an actual request use: useragent.parse(req.headers['user-agent']);
    var agent = useragent.parse('Googlebot-News');
    
    // will log true
    console.log(agent.device.toJSON().family === 'Spider');
    

    It is fast and kept up to date pretty well; this seems like the best approach. You can run the above script in your browser: runkit
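
    For example, you can run the check once per request as Express middleware (a sketch; req.isCrawler is just a name I picked for the flag):

    var express = require('express');
    var useragent = require('useragent');

    var app = express();

    // Classify every request once, up front; useragent marks known
    // bots with the device family 'Spider'.
    app.use(function (req, res, next) {
      var agent = useragent.parse(req.headers['user-agent']);
      req.isCrawler = agent.device.toJSON().family === 'Spider';
      next();
    });

    app.get('*', function (req, res) {
      // placeholder branches: swap in your pre-render and your normal shell
      res.send(req.isCrawler ? 'pre-rendered page' : 'client-side shell');
    });

    app.listen(3000);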

  3. I found the isbot package, which has a built-in isbot() function. It seems to me that the package is properly maintained and that they keep everything up to date.

    USAGE:

    const isBot = require('isbot');
    
    ...
    
    isBot(req.get('user-agent'));
    

    Package: https://www.npmjs.com/package/isbot
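
    For the pre-render case from the question, you could branch on it per request, roughly like this (a sketch; the two response branches are placeholders for the server-rendered markup and the normal client bundle):

    const express = require('express');
    const isBot = require('isbot');

    const app = express();

    app.get('*', (req, res) => {
      if (isBot(req.get('user-agent'))) {
        // crawler: send the pre-rendered HTML (e.g. from ReactDOMServer)
        res.send('<!doctype html><html>...server-rendered markup...</html>');
      } else {
        // normal visitor: send the untouched client-side shell
        res.sendFile('index.html', { root: __dirname });
      }
    });

    app.listen(3000);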
