I'm trying to write a script that downloads all the Google image search results to build a dataset for my ML project. I was following this tutorial to download the high-resolution images, but an error suddenly appears:

Refused to load the script 'https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js' because it violates the following Content Security Policy directive: "script-src 'report-sample' 'nonce-Q6xQOKx7e+e0TlGbQFPX3g' 'unsafe-inline'". Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback

Some help would be greatly appreciated. I run this code by pasting it into the javascript console. Thanks!

var script = document.createElement('script');
script.src = "https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js";
document.getElementsByTagName('head')[0].appendChild(script);

// grab the URLs
var urls = $('.rg_di .rg_meta').map(function() {
  return JSON.parse($(this).text()).ou;
});

// write the URLs to file (one per line)
var textToSave = urls.toArray().join('\n');
var hiddenElement = document.createElement('a');
hiddenElement.href = 'data:attachment/text,' + encodeURI(textToSave);
hiddenElement.target = '_blank';
hiddenElement.download = 'urls.txt';
hiddenElement.click();

3 Answers

  1. I think you need to add something like this:

    <meta http-equiv="Content-Security-Policy" content="default-src https://cdn.example.net; child-src 'none'; object-src 'none'">
    

    Add it to the page's policies; there are several ways to do this (see the docs).

  2. The "Refused to load the script" error is caused by the page's Content Security Policy (CSP). In Firefox you can disable CSP by opening about:config in the URL bar and setting security.csp.enable to false.

    I tried for testing with the code below in the Firefox console:

        javascript: (function(e, s) {
            e.src = s;
            e.onload = function() {
                // Release the $ alias; use the jQuery name from here on.
                jQuery.noConflict();
                console.log('jQuery injected');

                // Log the src of every thumbnail.
                jQuery(".rg_i").get().forEach(function(entry, index, array) {
                    var src = jQuery(entry).attr('src');
                    console.log("src1: " + src);
                });

                // Without a specific element, attr() reads the first match only.
                var src = jQuery('.rg_i').attr('src');
                console.log("src2: " + src);
            };
            document.head.appendChild(e);
        })(document.createElement('script'), '//ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js');
    

    Good luck 🙂

  3. You are using jQuery for something that can be done in native javascript.

    document.querySelectorAll accepts mostly the same selectors as jQuery. It does not return an array, but an (in my opinion) unwieldy NodeList.

    To get it to iterate properly, I prefer to spread it into an array and then call forEach on it.

    [...document.querySelectorAll('.foo')].forEach((element, index) => {
       console.log(element.innerText);
    });
    <div class="foo">bar</div>
    <div class="foo">baz</div>
    <div class="foo">bal</div>

    Also, the way to get the data has changed since that tutorial was written.

    First you need to trigger a click on every image.
    This activates JavaScript event handlers that set the href of the image's grandparent element.
    Google's event handlers have to run first, so we defer the rest of our code to give the Google script a chance to do its thing and update the DOM. We do this with setTimeout().
    Once the Google scripts have run and the DOM elements have been updated, our scheduled timeouts get their turn, and by then the hrefs have been populated.
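
The ordering that this relies on can be checked outside the browser. The sketch below is only an illustration: clickAndReadLater and its labels are made-up names, and the push calls stand in for el.click() and for reading el.href. It shows that a setTimeout callback never runs before the synchronous code that scheduled it has finished.

```javascript
// Records the order in which the pieces of code run.
const order = [];

// Hypothetical stand-in for "click now, read the href later".
function clickAndReadLater(label) {
  order.push('click ' + label);      // stands in for el.click()
  setTimeout(() => {
    order.push('read ' + label);     // stands in for reading el.href
  }, 0);
}

clickAndReadLater('a');
clickAndReadLater('b');
order.push('loop done');

// No timeout callback has run at this point, even with a delay of 0:
// the 'read' entries are appended only after this script has finished.
console.log(order);
```

A delay of 0 is enough to show the ordering; whether a given delay is long enough for the page's own scripts to finish updating the DOM is a separate question that this sketch does not answer.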

    Before the click the link looks like this:
    [image: before click]

    After the click:
    [image: after click]

    We now see that the href has been populated. The URL that was set is:

    https://www.google.com/imgres?imgurl=https%3A%2F%2Fwww.researchgate.net%2Fprofile%2FJerome_Droniou%2Fpublication%2F305983658%2Ffigure%2Ffig5%2FAS%3A668650201690119%401536430039650%2FMesh-patterns-for-the-tests-using-the-HMM-method-left-Test-1-right-Test-2.png&imgrefurl=https%3A%2F%2Fwww.researchgate.net%2Ffigure%2FMesh-patterns-for-the-tests-using-the-HMM-method-left-Test-1-right-Test-2_fig5_305983658&tbnid=_UuLNMPCQAT0uM&vet=12ahUKEwjhsu31zcnoAhWbgKQKHR3jAdUQMygAegUIARDTAQ..i&docid=LThLi5REXoitfM&w=428&h=428&q=hmm%20test&ved=2ahUKEwjhsu31zcnoAhWbgKQKHR3jAdUQMygAegUIARDTAQ
    

    In this URL, the value after imgurl= starts with https. This is our target image URL, but it is URL-encoded and embedded in a larger URL.
    So we cut it out with some simple substring manipulation.

    The result still contains percent-encoded characters:

    https%3A%2F%2Fwww.researchgate.net%2Fprofile%2FJerome_Droniou%2Fpublication%2F305983658%2Ffigure%2Ffig5%2FAS%3A668650201690119%401536430039650%2FMesh-patterns-for-the-tests-using-the-HMM-method-left-Test-1-right-Test-2.png

    For that we can use decodeURIComponent() to turn it into a normal URL:

    document.write(decodeURIComponent('https%3A%2F%2Fwww.researchgate.net%2Fprofile%2FJerome_Droniou%2Fpublication%2F305983658%2Ffigure%2Ffig5%2FAS%3A668650201690119%401536430039650%2FMesh-patterns-for-the-tests-using-the-HMM-method-left-Test-1-right-Test-2.png'))
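
As an alternative to cutting the string by hand, the standard URL class can parse the link and hand back the imgurl parameter already decoded. This is a sketch, not part of the original approach: extractImageUrl is a hypothetical helper name, and the example.com link is invented to match the shape of the URL shown above.

```javascript
// Returns the decoded imgurl parameter of a Google result link,
// or null when the link is empty, malformed, or has no imgurl parameter.
function extractImageUrl(href) {
  let parsed;
  try {
    parsed = new URL(href);
  } catch (err) {
    return null; // e.g. the href had not been populated yet
  }
  // searchParams.get() already percent-decodes the value.
  return parsed.searchParams.get('imgurl');
}

// Hypothetical link in the same shape as the one shown above.
const exampleHref = 'https://www.google.com/imgres' +
  '?imgurl=https%3A%2F%2Fexample.com%2Fimages%2Fmesh.png' +
  '&imgrefurl=https%3A%2F%2Fexample.com%2Fpage&w=428&h=428';
console.log(extractImageUrl(exampleHref));
```

Returning null instead of throwing makes it easy to skip links whose href was still empty when it was read.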

    We then add this to our array.

    When we’ve handled everything, we create the urls file and download it.

    var urls = [];
    var count = 0;
    [...document.querySelectorAll('.rg_i')].forEach((element, index) => {
       // The link that Google fills in is the thumbnail's grandparent.
       let el = element.parentElement.parentElement;
       el.click();
       count++;
       // Defer reading the href until Google's click handlers have run.
       setTimeout(() => {
           let google_url = el.href;

           // Cut out the value of the imgurl parameter and decode it.
           let start = google_url.indexOf('=' , google_url.indexOf('imgurl'))+1;
           let encoded = google_url.substring(start, google_url.indexOf('&', start));
           let url = decodeURIComponent(encoded);
           urls.push(url);
           console.log(count);
           // The last timeout to fire writes the file (one URL per line).
           if(--count == 0) {
              let textToSave = urls.join('\n');
              let hiddenElement = document.createElement('a');
              hiddenElement.href = 'data:attachment/text,' + encodeURI(textToSave);
              hiddenElement.target = '_blank';
              hiddenElement.download = 'urls.txt';
              hiddenElement.click();
           }

       }, 50);

    });
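
One possible refinement of the file-writing step is sketched below. It is a variation under stated assumptions, not what the script above does: buildDownloadHref is a hypothetical helper, text/plain is swapped in as the MIME type, and encodeURIComponent is used instead of encodeURI. The reason for the swap is that encodeURI leaves '#' unescaped, and a '#' inside a data: URI may be treated as the start of a fragment, which would cut the file short. The helper also drops empty entries and duplicates.

```javascript
// Builds the href for a downloadable text file with one URL per line.
function buildDownloadHref(urls) {
  // Drop empty entries and duplicates while keeping the original order.
  const unique = [...new Set(urls.filter(Boolean))];
  // encodeURIComponent also escapes '#', '&' and '?', which encodeURI leaves alone.
  return 'data:text/plain;charset=utf-8,' + encodeURIComponent(unique.join('\n'));
}
```

In the script above, the result would be assigned to hiddenElement.href in place of the 'data:attachment/text,' string.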
    