
I have a question about react-router and the cached pages that Google shows in its search results. We have a SPA that uses react-router (via browserHistory). The problem is that a Google cached page is a wrapper page whose URL differs from the URLs defined in the SPA's router, so the application's routing falls through to the page-not-found definition.
(example )

As a result, the Google cached copy of a SPA page displays the PageNotFoundApp component (the route for page not found, *) instead of the content of the page.
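To illustrate the mismatch, here is a minimal sketch of how a cached-page URL looks to the router. The cache id `abc123` and the site `example.com` are made-up values, and the exact shape of the `q` parameter is an assumption based on the `/search` path and `cache` prefix that the answers check for.

```javascript
// Hypothetical cached-page URL; the id and the target site are placeholders.
const cachedUrl = new URL(
  'https://webcache.googleusercontent.com/search?q=cache:abc123:https://example.com/products/42'
);

// The router only sees the pathname of the wrapper page, not the original path.
const routerPathname = cachedUrl.pathname;

// The original page address is carried inside the "q" query parameter.
const cacheQuery = cachedUrl.searchParams.get('q');

console.log(routerPathname); // "/search"
console.log(cacheQuery);     // "cache:abc123:https://example.com/products/42"
```

Since "/search" is not a route of the SPA, the catch-all route matches and the page-not-found component is rendered.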

Do you have any idea how this problem could be resolved?

2 Answers


  1. One option would be to intercept the routing logic by using the onEnter hook:

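    // Canonical address of the project (replace with the production origin).
    const projectCanonnicalAddr = "http://localhost";

    // Takes the text after the last ':' in the query and drops everything before the first '/'.
    function cacheQueryParser(query) {
        let out = '';
        if (typeof query === 'string') {
            out = query.split(':').pop().replace(/^[^/]*/, '');
        }
        return out;
    }

    // onEnter hook: if the location looks like a Google cache URL for this project,
    // redirect to the path extracted from the cache query.
    function intercepPath(next, replace) {
        if (next.location.pathname === '/search'
            && next.location.query.q
            && next.location.query.q.indexOf('cache') === 0
            && next.location.query.q.indexOf(projectCanonnicalAddr) > -1) {
            replace(null, cacheQueryParser(next.location.query.q));
        }
    }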
    

    After this, for the catch-all route definition you can use something like this:

    <Route path="*" component={PageNotFoundApp.container} onEnter={intercepPath}/>
    

    Please note that calling the injected replace function actually navigates the browser to the path provided as the second parameter. I have not tested this with the Google cache, and it might be a wrong implementation.
    As an option, you could pass a valid state as the first parameter of this function.
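Because the cache query may or may not include a scheme or a port, the path extraction can be made a bit more defensive. The following is only a sketch: `extractCachedPath` is a hypothetical helper, and it assumes the query has the form `cache:<id>:<url>`, optionally followed by search terms separated by whitespace. It has not been tested against real Google cache URLs.

```javascript
// Hypothetical helper: recover the SPA path from a Google cache query.
// Assumed input shape: "cache:<id>:<url>[ extra search terms]".
function extractCachedPath(query, canonicalHost) {
  // Only handle strings that start with the "cache:" prefix.
  if (typeof query !== 'string' || query.indexOf('cache:') !== 0) {
    return null;
  }
  // The query must mention the host of this project.
  const hostIndex = query.indexOf(canonicalHost);
  if (hostIndex === -1) {
    return null;
  }
  // Take everything after the host, up to the first whitespace.
  const rest = query.slice(hostIndex + canonicalHost.length).split(/\s/)[0];
  // Drop an optional ":port" that directly follows the host.
  const path = rest.replace(/^:\d+/, '');
  // Fall back to the root path if nothing usable remains.
  return path.charAt(0) === '/' ? path : '/';
}

console.log(extractCachedPath('cache:abc123:http://localhost/products/42', 'localhost'));
console.log(extractCachedPath('cache:abc123:localhost:3000/about', 'localhost'));
```

A `null` result means the location is not a cache URL for this project, so the hook can leave the not-found route untouched.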

  2. A solution to this problem might be to bypass loading the SPA if the domain name is different. This only makes sense when the pages have a fallback HTML version that is used when JavaScript is disabled (see: https://web.dev/without-javascript/).

    For example, the HTML pages of the SPA need to have a base href:

    <html>
        <head>
            <base href="https://example.com">
            ...
    

    And index.js might look like this:

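    // base.href is returned as a fully resolved URL (for example with a trailing slash),
    // so compare parsed hosts instead of stripping the scheme from the string.
    const base = document.querySelector('head base');
    const domain = new URL(base.href).host;

    if (window.location.host === domain) {
      // Served from the canonical domain: load the SPA.
      import('./App');
    } else {
      // Served from another domain (e.g. the Google cache): keep the fallback HTML.
      const root = document.getElementById("root");
      root.classList.remove('loader');
      // or something else
    }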
    

    As a result, search engines will index the single page application, but when the pages are opened from the Google cache, the fallback HTML pages will be shown.
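The host check in this answer can also be isolated into a small pure function, which makes it easy to try outside the browser. This is a sketch only: `shouldLoadApp` is a hypothetical name, and it relies on the standard `URL` class to parse the base href, since the `href` property of a base element is returned as a resolved URL and may carry a trailing slash.

```javascript
// Hypothetical helper: decide whether the SPA bundle should be loaded.
// baseHref is the value of <base href>, currentHost is window.location.host.
function shouldLoadApp(baseHref, currentHost) {
  try {
    // URL#host contains the hostname plus the port, if a non-default one is set.
    return new URL(baseHref).host === currentHost;
  } catch (e) {
    // A base href that cannot be parsed is treated as a mismatch.
    return false;
  }
}

console.log(shouldLoadApp('https://example.com/', 'example.com'));
console.log(shouldLoadApp('https://example.com/', 'webcache.googleusercontent.com'));
```

In index.js this could be called as shouldLoadApp(base.href, window.location.host) before the dynamic import.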
