
I'm trying to get the href attribute of the links that sit under a specific class.
This is an example of the HTML:

This class and href are repeated multiple times throughout the page.

I have tried the code below, but it only grabs the first one. I'm trying to grab every instance of the href.

Link to the page I'm scraping:
https://www.domain.com.au/sale/toowoomba-qld-4350/?bedrooms=3-any&price=0-600000&excludeunderoffer=1

element = driver.find_element(By.XPATH, "//div[@class='slick-slide slick-active slick-current']")
links = element.find_elements(By.CSS_SELECTOR, "a[href*='https://www.domain.com.au']")
urls = []
for link in links:
    urls.append(link.get_attribute("href"))
    print(urls)

2 Answers


  1. You should first get all matching elements and then read the href of each one:

    elements = driver.find_elements(By.XPATH, "//*[@class='slick-slide slick-active slick-current']")
    urls = []
    for element in elements:
        # Call get_attribute on each element, not on the list
        urls.append(element.get_attribute("href"))
    print(urls)
    

    But this may match elements that have no href attribute, so you may need to add a condition or some exception handling.

    Also, in your example the href is on the a tag, so write your XPath to target that tag.

  2. # Parent element
    element = driver.find_element(By.XPATH, "//div[@class='slick-slide slick-active slick-current']")
    
    # Anchor tags
    links = element.find_elements(By.CSS_SELECTOR, "a[href*='https://www.domain.com.au']")
    
    # All urls 
    urls = [l.get_attribute('href') for l in links]
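
A note that may help tie the two answers together: `find_element` returns only the first match, so anything searched inside it is limited to that one block. Below is a minimal sketch of one way to gather every link instead. The helper name `collect_hrefs` is hypothetical, and the selector in the usage comment assumes every listing block carries the `slick-slide` class, which I have not verified against the page.

```python
def collect_hrefs(links):
    """Return the unique, non-empty href values of the given elements, in order."""
    urls = []
    seen = set()
    for link in links:
        href = link.get_attribute("href")
        # Skip elements without an href and links already collected
        if href and href not in seen:
            seen.add(href)
            urls.append(href)
    return urls


# Usage sketch (assumes `driver` is an open Selenium WebDriver on the page,
# and that the selector below fits the page's markup):
# from selenium.webdriver.common.by import By
# links = driver.find_elements(
#     By.CSS_SELECTOR, "div.slick-slide a[href*='https://www.domain.com.au']"
# )
# print(collect_hrefs(links))
```

Calling `find_elements` on the driver with a descendant selector searches the whole page in one pass, and filtering on the returned value covers the case from the first answer where a matched element has no href.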
    