
I'm facing a problem with Selenium in Python.
I would like my code to print the following data, an email address: [email protected]

I just need a hint, that's all.

HTML:

<section tabindex="-1" class="pv-profile-section pv-contact-info artdeco-container-card">
<!---->
<h2 class="text-body-large-open mb4">
   Información de contacto
</h2>
<div class="pv-profile-section__section-info section-info" tabindex="-1">
<section class="pv-contact-info__contact-type ci-vanity-url">
   <li-icon aria-hidden="true" type="linkedin-bug" class="pv-contact-info__contact-icon" size="medium">
      <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" data-supported-dps="24x24" fill="currentColor" class="mercado-match" width="24" height="24" focusable="false">
         <path d="M20.5 2h-17A1.5 1.5 0 002 3.5v17A1.5 1.5 0 003.5 22h17a1.5 1.5 0 001.5-1.5v-17A1.5 1.5 0 0020.5 2zM8 19H5v-9h3zM6.5 8.25A1.75 1.75 0 118.3 6.5a1.78 1.78 0 01-1.8 1.75zM19 19h-3v-4.74c0-1.42-.6-1.93-1.38-1.93A1.74 1.74 0 0013 14.19a.66.66 0 000 .14V19h-3v-9h2.9v1.3a3.11 3.11 0 012.7-1.4c1.55 0 3.36.86 3.36 3.66z"></path>
      </svg>
   </li-icon>
   <h3 class="pv-contact-info__header t-16 t-black t-bold">
      Perfil de Marco
   </h3>
   <div class="pv-contact-info__ci-container t-14">
      <a href="https://www.mywebsite.com" class="pv-contact-info__contact-link link-without-visited-state t-14">
      mywebsite.com/italia/caio_plinio
      </a>
   </div>
</section>
<section class="pv-contact-info__contact-type ci-email">
<li-icon aria-hidden="true" type="envelope" class="pv-contact-info__contact-icon">
   <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" data-supported-dps="24x24" fill="currentColor" class="mercado-match" width="24" height="24" focusable="false">
      <path d="M2 4v13a3 3 0 003 3h14a3 3 0 003-3V4zm18 2v1.47l-8 5.33-8-5.33V6zm-1 12H5a1 1 0 01-1-1V8.67L12 14l8-5.33V17a1 1 0 01-1 1z"></path>
   </svg>
</li-icon>
<h3 class="pv-contact-info__header t-16 t-black t-bold">
   Email
</h3>
<div class="pv-contact-info__ci-container t-14">
   <a href="mailto:  [email protected]" class="pv-contact-info__contact-link link-without-visited-state t-14" target="_blank" rel="noopener noreferrer">
   [email protected]
   </a>
</div>

This is the relevant part of the code I have written:

contact_info = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, 'pv-contact-info__ci-container')))

# Find the email element within the contact info section
email_element = contact_info.find_element(By.CSS_SELECTOR, 'a.pv-contact-info__contact-link')


email = email_element.get_attribute('innerHTML')

print(email)

Output: mywebsite.com/italia/caio_plinio

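I think the problem is that the class pv-contact-info__ci-container is used more than once on the page, so the first match (the profile link) is returned.

I think the solution is to differentiate by section class, but how can I write that? (I just need a hint.)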


My desired output:

[email protected]

2 Answers


  1. With an XPath locator, you can use the code below:

    email_element = driver.find_element(By.XPATH, "//h3[contains(text(),'Email')]//following::a[1]")
    print (email_element.text)
    

    Result:

    [email protected]
    

    XPath expression explanation: the expression below locates the first <a> node that follows, in document order, the <h3> node whose text contains Email.

    //h3[contains(text(),'Email')]//following::a[1]
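
The question asks how to differentiate by section class. As an alternative to XPath, a CSS selector scoped to the section carrying the ci-email class might look like the sketch below. The helper name read_email and the two constants are hypothetical; the selector assumes the markup shown in the question, and driver is assumed to be a Selenium WebDriver that already has the contact info panel open.

```python
# Selenium's By.CSS_SELECTOR constant is the plain string "css selector",
# so this helper needs no import and works with any WebDriver instance.
CSS_SELECTOR = "css selector"

# In the markup from the question, only the email block carries the
# ci-email class, so scoping the lookup to it skips the profile URL link.
EMAIL_LINK = "section.ci-email a.pv-contact-info__contact-link"


def read_email(driver):
    """Return the visible text of the email link, stripped of whitespace."""
    link = driver.find_element(CSS_SELECTOR, EMAIL_LINK)
    return link.text.strip()
```

Calling print(read_email(driver)) would then print the address. Passing By.CSS_SELECTOR instead of the string constant is equivalent and is the more common spelling in Selenium code.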
    
  2. To extract the text [email protected], wait with WebDriverWait for visibility_of_element_located() instead of presence_of_element_located(). You can use either of the following locator strategies:

    • Using XPATH, following-sibling and text attribute:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h3[contains(., 'Email') and contains(@class, 'pv-contact-info__header')]//following-sibling::div[1]/a"))).text)
      
    • Using XPATH, following and get_attribute("innerHTML"):

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h3[contains(., 'Email') and contains(@class, 'pv-contact-info__header')]//following::div[1]/a"))).get_attribute("innerHTML"))
      
    • Note: you have to add the following imports:

      from selenium.webdriver.support.ui import WebDriverWait
      from selenium.webdriver.common.by import By
      from selenium.webdriver.support import expected_conditions as EC
      

    You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium – Python
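
One difference between the two bullets above: .text returns the rendered text, while get_attribute("innerHTML") returns the raw markup, including the newlines and indentation around the address. The href attribute is a third source, but it carries a mailto: prefix and, in the question's HTML, extra spaces. A minimal sketch of cleaning both values follows; the function names are hypothetical, and it assumes the strings have already been read from the element.

```python
from urllib.parse import unquote


def clean_inner_html(raw):
    """Collapse the whitespace around text read via get_attribute('innerHTML')."""
    return " ".join(raw.split())


def email_from_href(href):
    """Extract the address from a mailto: href, or return None if it is not one."""
    # Decode percent-escapes first, in case the browser encoded the spaces.
    cleaned = unquote(href).strip()
    if not cleaned.lower().startswith("mailto:"):
        return None
    address = cleaned[len("mailto:"):]
    # Drop an optional ?subject=... query and the padding around the address.
    return address.split("?", 1)[0].strip() or None
```

For example, clean_inner_html(element.get_attribute("innerHTML")) would give the same single-line string that element.text returns for this link.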

