I want to get the src of multiple images, but their selectors don't seem to work, as if the elements are fake and aren't actually in the page.
https://imgflip.com/memegenerator/Drake-Hotline-Bling
This is one of the pages. Every page has one of my desired elements (the main image), each with the same selector.
I have tried multiple selectors like:
'#mm-preview-outer > div.mm-preview > img'
and 'img[alt="meme generator image preview"]'
but they don’t work.
I tested my code by scraping other elements and everything works, but when I change the selector in the .$eval() call to my desired element, it doesn't work (no errors).
This is my code working correctly with a different selector:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    defaultViewport: false,
    userDataDir: './tmp'
  });
  const page = await browser.newPage();
  const page2 = await browser.newPage();
  await page.goto('https://imgflip.com/memetemplates');
  const boxes = await page.$$('.mt-boxes > .mt-box');
  for (const box of boxes) {
    try {
      const title = await page.evaluate((el) => el.querySelector('h3 > a').textContent, box);
      const pageurl = await page.evaluate((el) => el.querySelector('a.mt-caption').getAttribute('href'), box);
      await page2.goto(`https://imgflip.com${pageurl}`);
      const imageurl = await page.$eval('img[alt="Imgflip Logo"]', el => el.src);
      console.log('The source of', title, 'is');
      console.log(imageurl);
    } catch (error) {}
  }
  await browser.close();
})();
Technically, all I need to do is change 'img[alt="Imgflip Logo"]' to 'img[alt="meme generator image preview"]', but this doesn't work.
2 Answers
To get img[alt="meme generator image preview"] you can use the 'img.mm-img' or 'img[class^=mm-img]' selectors (note: ^= means "begins with"), and replace el.src with el.getAttribute('src') in your code.
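A minimal sketch of that change, assuming the preview image matches 'img.mm-img' as stated above. The helper name `getImageSrc` is hypothetical. The sketch also assumes the query should run on the tab that actually navigated to the meme page: the loop in the question calls `page2.goto(...)` but then `page.$eval(...)`, so it reads from the template list page, and the empty `catch` hides the resulting error.

```javascript
// Hypothetical helper: read the src attribute of the first element matching
// `selector` on the given Puppeteer page (pass the tab that called goto()).
async function getImageSrc(page, selector = 'img.mm-img') {
  // Wait until the element exists; it may be added after the initial load.
  await page.waitForSelector(selector);
  // getAttribute('src') returns the attribute as written in the HTML,
  // while el.src would return the resolved absolute URL.
  return page.$eval(selector, (el) => el.getAttribute('src'));
}
```

Inside the loop this would be called as `const imageurl = await getImageSrc(page2);` right after `page2.goto(...)`.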
You can also rewrite some of the other parts like below, even though that is not necessary for what you want.
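One possible rewrite of the template loop, as a minimal sketch rather than the answerer's own code. It assumes the selectors from the question ('.mt-boxes > .mt-box', 'h3 > a', 'a.mt-caption') and 'img.mm-img' still match the site. The function name `scrapeMemeImages` and its parameters are hypothetical. It collects all links first, reuses a single tab, and logs failures instead of swallowing them.

```javascript
// Sketch: collect every template link first, then visit each one in the same tab.
async function scrapeMemeImages(page, listUrl = 'https://imgflip.com/memetemplates') {
  await page.goto(listUrl);
  // Read title and href for every box in one call, before navigating away.
  const templates = await page.$$eval('.mt-boxes > .mt-box', (boxes) =>
    boxes.map((box) => ({
      title: box.querySelector('h3 > a').textContent,
      href: box.querySelector('a.mt-caption').getAttribute('href'),
    }))
  );
  const results = [];
  for (const { title, href } of templates) {
    try {
      // Resolve the relative href against the list URL.
      await page.goto(new URL(href, listUrl).href);
      const src = await page.$eval('img.mm-img', (el) => el.getAttribute('src'));
      results.push({ title, src });
    } catch (error) {
      // Report failures so a non-matching selector is visible.
      console.error(`Failed for ${title}:`, error.message);
    }
  }
  return results;
}
```

With a real browser this could be called as `await scrapeMemeImages(await browser.newPage())`.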
If you need to get the already existing memes, you can do it this way:
Code:
Try not to select by alt or title attributes when you can use classes or ids.
This selector works in the console; maybe it'll help:
document.querySelector('.mm-preview img').src
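A small sketch of how that console check could be run from Puppeteer, assuming the class-based selectors mentioned in the answers match the page. `firstMatchingSrc` is a hypothetical helper name, and the default selector list is only an illustration.

```javascript
// Sketch: try class-based selectors in order and return the first image src found.
async function firstMatchingSrc(page, selectors = ['.mm-preview img', 'img.mm-img']) {
  for (const selector of selectors) {
    // The callback runs in the browser, so `document` is the page's document.
    const src = await page.evaluate((sel) => {
      const el = document.querySelector(sel);
      return el ? el.src : null;
    }, selector);
    if (src) return { selector, src };
  }
  // None of the selectors matched an element with a src.
  return null;
}
```

Returning the matching selector alongside the src makes it easier to see which one actually works on a given page.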