I'm currently trying to scrape the following website.

I need to get several subpages, but some of them are hidden behind a "load more" button. I planned to use Selenium to click that button, but I can't get past the cookie consent button.

import requests 
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

url = "https://www.sparhandy.de/handy-vertrag/"
driver = webdriver.Chrome(r"C:\Web_driver\chromedriver.exe")
driver.get(url)
driver.implicitly_wait(10)

try:
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="uc-center-container"]/div[2]/div/div/div/div/button'))).click()
    
except TimeoutException:
    print("Cookie button not clickable or page took too long to load.")

input("Press Enter to close the browser...")
driver.quit()

This is the code I'm using.

The button looks like this:

[Image: Button]

This is what the button looks like in inspect element:

[Image: Button]
(Sorry for the image, but I don't know how to copy only this part of the HTML code.)

I expect Selenium to find the button, but instead it always times out because it cannot find the button.

Thanks for any help.

P.S. Before you mark this as a duplicate, note that I have already read the other posts about this problem and implemented what I found there.

2 Answers


  1. Try the code below to click the button in the cookie pop-up:

    element = driver.execute_script("""return document.querySelector('#usercentrics-root').shadowRoot.querySelector("button[data-testid='uc-accept-all-button']")""")
    element.click()
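
The one-liner above can fail if the banner has not rendered yet, because the query then has nothing to return. A minimal sketch of a more defensive variant follows, assuming the same `#usercentrics-root` host and `data-testid` selector as in the answer. The helper name `click_cookie_button` and the constant `FIND_ACCEPT_BUTTON_JS` are made up for illustration, and the optional chaining (`?.`) assumes a reasonably recent browser.

```python
# JavaScript that returns the accept button, or nothing if the banner is not there yet.
# Optional chaining (?.) avoids a TypeError while #usercentrics-root or its
# shadow root is still missing.
FIND_ACCEPT_BUTTON_JS = (
    "return document.querySelector('#usercentrics-root')"
    "?.shadowRoot"
    "?.querySelector(\"button[data-testid='uc-accept-all-button']\");"
)


def click_cookie_button(driver):
    """Click the accept button if it exists; return True if a click was made.

    `driver` is expected to be a Selenium WebDriver (anything that offers
    execute_script works for this sketch).
    """
    button = driver.execute_script(FIND_ACCEPT_BUTTON_JS)
    if button is None:
        # Banner not rendered yet (or already dismissed): nothing to click.
        return False
    button.click()
    return True
```

Calling `click_cookie_button(driver)` after `driver.get(url)` then reports whether the banner was found instead of raising when `element` is `None`.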
    
  2. You have two problems here.

    1. The consent button sits inside a shadow root, so you need to get it via the JS executor.
    2. The consent banner does not appear immediately after the page loads, so you should wait for it inside the JS function.
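
As a side note on the first problem: an XPath query against the document cannot see into a shadow root, which is why the original locator times out. Besides the JS executor, Selenium 4 also exposes a `shadow_root` property on elements. The sketch below assumes Selenium 4.1 or newer with a Chromium-based browser, and it reuses the `data-testid` selector from the other answer. The function name `find_accept_button` and the two constants are hypothetical.

```python
# Locator strategy strings. These are the values of By.ID and By.CSS_SELECTOR
# from selenium.webdriver.common.by, spelled out so the sketch needs no imports.
BY_ID = "id"
BY_CSS = "css selector"


def find_accept_button(driver):
    """Return the accept button from inside the Usercentrics shadow root.

    Assumes `driver` is a Selenium 4 WebDriver and that the banner is
    already present; waiting for it is a separate concern.
    """
    # 1. Find the shadow host in the regular DOM.
    host = driver.find_element(BY_ID, "usercentrics-root")
    # 2. Step into its shadow root (Selenium 4 property).
    root = host.shadow_root
    # 3. Search inside the shadow root; CSS selectors are the safe choice here.
    return root.find_element(BY_CSS, "button[data-testid='uc-accept-all-button']")
```

This only addresses the shadow root itself. The timing problem from point 2 still applies, so the lookup has to be retried or wrapped in a wait.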

    So I wrote a solution that lets you drop implicitly_wait(10):

    driver.get('https://www.sparhandy.de/handy-vertrag/')
    wdwait = WebDriverWait(driver, 10)
    wdwait.until(EC.presence_of_element_located((By.ID, "usercentrics-root")))
    consentButton = driver.execute_script("function sleep(ms) { return new Promise(resolve => setTimeout(resolve, "
                                                "ms));} async function waitForConsent() {let consent;let tries = 20; "
                                                "while (!consent && tries > 0) {consent = document.querySelector("
                                                "'#usercentrics-root').shadowRoot.querySelector('[role=dialog] "
                                                "button');tries--;await sleep(500);} return consent;} return await waitForConsent()")
    consentButton.click()
    wdwait.until(EC.invisibility_of_element(consentButton))
    

    A more readable version of the JS function passed to the executor:

    function sleep(ms) {
      return new Promise(resolve => setTimeout(resolve, ms));
    }
    
    async function waitForConsent() {
      let consent;
      let tries = 20;
      while (!consent && tries > 0) {
        consent = document.querySelector('#usercentrics-root').shadowRoot.querySelector('[role=dialog] button');
        tries--;
        await sleep(500);
      }
      return consent;
    }
    
    return await waitForConsent();

    The function checks the DOM every 500 ms for the button inside the shadow root, for at most 20 tries.
    Once the button appears, the function returns it so you can click it.

    With this approach you wait for a real condition. Relying on implicitly_wait is not best practice.
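
The same retry idea can also be driven from Python instead of an async JS function, which may be easier to debug. This is only a sketch under the answer's assumptions (the `#usercentrics-root` host and the `[role=dialog] button` selector). The names `wait_for_consent_button` and `SHADOW_BUTTON_JS` are invented here, and the defaults mirror the 20 tries and 500 ms interval described above.

```python
import time

# Synchronous lookup: returns the button, or null while the host, its shadow
# root, or the dialog button is not there yet.
SHADOW_BUTTON_JS = """
const host = document.querySelector('#usercentrics-root');
const root = host ? host.shadowRoot : null;
return root ? root.querySelector('[role=dialog] button') : null;
"""


def wait_for_consent_button(driver, tries=20, interval=0.5):
    """Poll for the consent button; return it, or None after `tries` attempts.

    `driver` only needs an execute_script method, as a Selenium WebDriver has.
    """
    for _ in range(tries):
        button = driver.execute_script(SHADOW_BUTTON_JS)
        if button is not None:
            return button  # found: stop polling immediately
        time.sleep(interval)  # not there yet: wait before the next attempt
    return None
```

A caller would check the result before clicking, for example `button = wait_for_consent_button(driver)` followed by `if button is not None: button.click()`.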
