
I have a problem with text scraping. The website that I am scraping has 3 identical lines of HTML, each containing different information. Like this:

'<div class="section">...</div>'

'<div class="section">...</div>'

'<div class="section">...</div>'

I got the text from the first <div class="section">...</div> using this code: 'soup.find("div", class_="section").text.strip()'

But I can't scrape the text from the 2nd and 3rd <div class="section">...</div>. Please help.

/PS: I'm new to web scraping, and English is my second language, so apologies if anything I wrote is unclear./

2 Answers


  1. divs = soup.find_all("div", class_="section")

    Note that this uses find_all rather than find: find returns only the first matching div, while find_all returns a list of all matching divs, even if there's only one, and you can then select each one individually. You can't call .text on the list itself (that raises an AttributeError), so keep the list and access its members like so:

    div1 = divs[0].text.strip()
    div2 = divs[1].text.strip()
    div3 = divs[2].text.strip()
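
To make the indexing concrete, here is a minimal, self-contained sketch. The sample HTML and its text are made up for illustration, and it assumes the list comes from find_all (find returns a single element, which cannot be indexed this way). The length check is an added precaution, not something the original page requires.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: three divs with the same class but different text.
html = """
<div class="section"> first </div>
<div class="section"> second </div>
<div class="section"> third </div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet of every matching div.
divs = soup.find_all("div", class_="section")

# Check the length before indexing so a missing div does not raise IndexError.
if len(divs) >= 3:
    div1 = divs[0].text.strip()
    div2 = divs[1].text.strip()
    div3 = divs[2].text.strip()
```

If the page might contain fewer than three such divs, looping over the list is safer than indexing by position.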
    
  2. You can use the find_all method of soup. Here is some sample code:

    sections = soup.find_all("div", class_="section")
    for section in sections:
        print(section.text.strip())
    

    find only selects the first div with class="section", whereas find_all selects every div with class="section"; the loop then prints the text inside each one.
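
The difference between the two methods can be seen side by side in a short sketch. The HTML string and the names first_text and all_texts are hypothetical, chosen only for this example; "html.parser" is Python's built-in parser, so nothing beyond BeautifulSoup itself needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page standing in for the real website.
html = """
<div class="section"> Alpha </div>
<div class="section"> Beta </div>
<div class="section"> Gamma </div>
"""

soup = BeautifulSoup(html, "html.parser")

# find returns only the first match, or None when nothing matches.
first = soup.find("div", class_="section")
first_text = first.text.strip() if first is not None else None

# find_all returns every match; collect the stripped text of each div.
all_texts = [section.text.strip()
             for section in soup.find_all("div", class_="section")]

print(first_text)
print(all_texts)
```

Collecting the texts into a list, rather than printing them inside the loop, keeps them available for later processing such as saving to a file.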
