I am using "puppeteer": "^19.11.1".
I created the following function to press the consent button on this page:
async function handleConsent(page, logger) {
  const consentButtonSelector =
    '#uc-center-container > div.sc-eBMEME.ixkACg > div > div > div > button.sc-dcJsrY.bSKKNx';
  try {
    // Wait for the iframe to load
    await page.waitForSelector("iframe", { timeout: 3000 });
    // Identify the iframe that contains the consent button
    const iframeElement = await page.$(
      'iframe[name="__tcfapiLocator"]'
    );
    if (iframeElement) {
      const iframeContent = await iframeElement.contentFrame();
      // Attempt to click the consent button within the iframe
      const consentButton = await iframeContent.$(consentButtonSelector);
      if (consentButton) {
        await iframeContent.click(consentButtonSelector);
        logger.info("Consent button clicked inside the iframe.");
      } else {
        logger.info("Consent button not found inside the iframe.");
      }
    } else {
      logger.info("Iframe with the consent message not found.");
    }
    await page.waitForTimeout(3000); // Wait for any potential redirects or updates after clicking
  } catch (error) {
    logger.error(`An error occurred while handling consent: ${error}`);
  }
}
My problem is that the selector is not found, even though I am trying to select the iframe.
Any suggestion on what I am doing wrong?
I appreciate your replies!
2 Answers
The consent button is not inside an iframe. It is inside a shadow root (`#shadow-root`). To access it, you first need to get its host and then read the host's `shadowRoot` property; only then can you reach the button. The shadow host selector is `#usercentrics-root`.
However, the consent dialog is rendered asynchronously, so the accept button may not be rendered yet when you get the host. You need to wait for the consent button to exist, for example by implementing a `waitFor` function inside the `evaluate` block. After clicking the button, it is good practice to wait until the consent shadow host is hidden.
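As a rough illustration of the steps above (get the host, read `shadowRoot`, wait for the button, click it, then wait for the host to be hidden), here is a minimal sketch. It leans on Puppeteer's built-in `page.waitForFunction` for the polling instead of a hand-written `waitFor` helper, which is a substitution rather than what the text describes. The name `clickConsentInShadowRoot` is made up for illustration, and the button selector is simply copied from the question, so treat both as assumptions.

```javascript
// Hypothetical helper: reaches the consent button through the host's shadowRoot.
// Assumes the shadow root is open; a closed root would make shadowRoot null.
async function clickConsentInShadowRoot(
  page,
  hostSelector = "#usercentrics-root",
  buttonSelector = "button.sc-dcJsrY.bSKKNx", // assumption: class names taken from the question
  timeout = 5000
) {
  // Poll in the browser until the host, its shadow root, and the button all exist.
  const handle = await page.waitForFunction(
    (hostSel, buttonSel) => {
      const host = document.querySelector(hostSel);
      const root = host ? host.shadowRoot : null;
      return root ? root.querySelector(buttonSel) : null;
    },
    { timeout },
    hostSelector,
    buttonSelector
  );

  const button = handle.asElement();
  if (!button) {
    throw new Error("Consent button handle is not an element");
  }
  await button.click();

  // Wait until the shadow host is hidden or removed from the DOM.
  await page.waitForSelector(hostSelector, { hidden: true, timeout });
}
```

It would be called as `await clickConsentInShadowRoot(page)` after `page.goto(...)`. Generated class names like the one above tend to change between deployments, so a more stable attribute on the button would be a better selector if one exists.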
More about shadowDOM
This answer is correct in pointing out that the element you want is in a shadow root, but the solution it provides should be avoided. You can use `>>>` to pierce shadow roots easily in Puppeteer.
Even if you couldn't use `>>>` and `waitForSelector` here, `page.waitForFunction` is preferred over rewriting polling from scratch, which is hard to maintain and unreliable.
However, I'm betting your actual goal on the site is not simply to click an accept button for the sake of it. Your actual goal is more likely to scrape data. But most of the critical data on the page is already there in the static HTML, not rendered asynchronously, so you should be able to scrape it easily without JS or clicking any buttons.
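For the `>>>` approach mentioned above, a minimal sketch might look like the following. The helper name `acceptConsent` is hypothetical, and the default selector just combines the `#usercentrics-root` host with the button classes from the question, so it is an assumption that it still matches the live page.

```javascript
// Hypothetical helper: uses Puppeteer's ">>>" deep combinator to reach into
// the shadow root under #usercentrics-root without touching shadowRoot manually.
async function acceptConsent(
  page,
  buttonSelector = "#usercentrics-root >>> button.sc-dcJsrY.bSKKNx", // assumption: classes from the question
  timeout = 5000
) {
  // waitForSelector handles the asynchronous rendering of the dialog.
  const button = await page.waitForSelector(buttonSelector, { timeout });
  await button.click();

  // Wait for the button to disappear so later steps don't race the dialog.
  await page.waitForSelector(buttonSelector, { hidden: true, timeout });
}
```

Usage would be `await acceptConsent(page)` right after navigating. If the button never appears, `waitForSelector` throws a timeout error, which the caller can catch and log.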
Clicking through a consent dialog like this illustrates a common antipattern in web scraping, which is assuming you need to behave like the user would, with JS enabled, and by dutifully clicking buttons. Much of the time, there's a more direct approach that's faster to run and write, and more reliable; basically better by any metric.
At this point, you can even skip Puppeteer entirely and use native `fetch` and a lightweight HTML parser like Cheerio. The output is the same, but Cheerio is faster.
If you want to scrape multiple pages, simply add a loop on the URL pagination rather than interacting with the UI.
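A loop over the pagination could be sketched roughly like this. Everything specific here is an assumption: the function name `scrapeAllPages`, the `page` query parameter, and the `parsePage` callback are placeholders, since the real URL scheme and selectors depend on the site. The `fetchImpl` parameter defaults to the native `fetch` (Node 18+) and exists only so the loop can be exercised without network access.

```javascript
// Hypothetical helper: requests page 1..lastPage and collects parsed items.
// Assumes the site paginates with a "page" query parameter.
async function scrapeAllPages(baseUrl, lastPage, parsePage, fetchImpl = fetch) {
  const results = [];
  for (let pageNumber = 1; pageNumber <= lastPage; pageNumber++) {
    const url = new URL(baseUrl);
    url.searchParams.set("page", String(pageNumber));

    const response = await fetchImpl(url);
    if (!response.ok) {
      throw new Error(`Request failed for ${url}: ${response.status}`);
    }

    // parsePage receives the raw HTML and returns an array of items.
    results.push(...parsePage(await response.text()));
  }
  return results;
}
```

Here `parsePage` could be a small function that loads the HTML into Cheerio and returns the fields you care about. Requests run one at a time, which is slower than firing them in parallel but gentler on the server.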
See "Puppeteer not giving accurate HTML code for page with shadow roots" for a detailed overview of shadow roots in Puppeteer.
Disclosure: I’m the author of the linked blog post.