Say that I have a piece of HTML code that looks like this:

<html>
    <body>
        <thspan class="sentence">He</thspan>
        <thspan class="sentence">llo</thspan>
    </body>
</html>

I want to get the text of both elements and concatenate it into a single string using Selenium in Python.

My current code looks like this:

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()

thspans = browser.find_elements(By.CLASS_NAME, "sentence")
context = ""
for thspan in thspans:
    context.join(thspan.text)

The code runs without errors, but the context variable stays empty. How can I get the text of both elements and join it into one string in Python Selenium?

2 Answers


  1. Chosen as BEST ANSWER

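    Use context += thspan.text instead of context.join(thspan.text), just like @Rajagopalan said. str.join returns a new string and does not modify context, so the result of each call was being thrown away.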


  2. You never navigate the browser to the page you want to scrape, and you are misusing the .join method. Here is code that should work for you:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    
    browser = webdriver.Chrome()
    # Use the absolute path to your HTML file if you are working locally,
    # or the URL of the page you want to scrape
    browser.get('file:///your/absolute/path/to/the/html/code/index.html')
    
    thspans = browser.find_elements(By.CLASS_NAME, "sentence")
    context = ''
    print('thspans', thspans, end='\n\n')
    for thspan in thspans:
        context += thspan.text
    print(context)
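
To see why the original loop left context empty, here is a minimal sketch that needs no browser. It assumes the two elements' .text values are the plain strings "He" and "llo"; the list name texts is a hypothetical stand-in for those values, not something from the question.

```python
# Hypothetical stand-in for [thspan.text for thspan in thspans]
texts = ["He", "llo"]

# str.join returns a NEW string; it never changes the string it is called on.
# Here the return value is discarded, so context stays empty.
context = ""
for t in texts:
    context.join(t)
unchanged = context

# Fix 1: accumulate with += as in the answers above.
context = ""
for t in texts:
    context += t

# Fix 2: call join on the separator and pass it the whole list of strings.
joined = "".join(texts)

print(repr(unchanged), context, joined)
```

With real elements, the second form would presumably be "".join(thspan.text for thspan in thspans), which avoids the explicit loop.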
    