How to get text when running a loop on a list using Python & HTML?

dmoney21
May 8, 2023
2 Answers

I am working on a project where I am trying to build a list of items from a website by scraping its HTML with Python (BeautifulSoup, to be specific). Long story short, I am able to print the items I am looking for, but the output includes all the tags instead of just the text. That led me to try the get_text and text functions, but I get an error message stating the following: "ResultSet object has no attribute 'text'. You're probably treating a list of elements like a single element."

This is what I tried in order to get only the text, but I get the AttributeError mentioned above:

import requests
from bs4 import BeautifulSoup


url = "https:mywebsite"

response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

rows = soup.find('div', {'class': "wrench"})



for row in rows:


        color = row.find_all_next('div', {'class': "wrenchColor"}).get_text

        print(color)
        break

For reference, the following is part of the html code:

<div class="wrench">
   <div class="wrenchColor">
etc...

If I don't use the get_text function, I get no error and the output is: "<div class="wrenchColor">Color Blue</div>, <div class="wrenchColor">Color Red</div>"

In this case, I would like to just get "Color Blue" and "Color Red". Any help would be greatly appreciated.

Answers

For each row returned from soup.find_all(), you just need to access its text attribute.

from bs4 import BeautifulSoup as BS
import requests

try:
    import lxml
    PARSER = 'lxml'
except ModuleNotFoundError:
    PARSER = 'html.parser'

URL = 'your URL goes here'
with requests.get(URL) as response:
    response.raise_for_status()
    soup = BS(response.text, PARSER)
    for row in soup.find_all('div', {'class': 'wrenchColor'}):
        print(row.text)

- Shahin
- May 8, 2023 at 9:09 am
- 0 votes
If you are sure that everything inside the wrench div is a color entry you need, you can iterate over its stripped_strings:
```
rows = soup.find('div', {'class': "wrench"})
for color in rows.stripped_strings:
    print(color)
```
This gives the following result:
```
Color Blue
Color Red
Color Green
Color White
Color Black
```
If not, or if you want shorter code, you can select only the wrenchColor divs with a CSS selector:
```
rows = map(lambda x: x.string, soup.select('div.wrench > div.wrenchColor'))
print(list(rows))
```
This gives the following result:
```
['Color Blue', 'Color Red', 'Color Green', 'Color White', 'Color Black']
```
The problem you're facing is that find_all_next() returns a ResultSet, which is a list of elements rather than a single element. To get the text inside each tag, you need a separate for loop over the results of this method, calling get_text() on each element.
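A minimal sketch of that per-element loop, assuming markup like the snippet in the question. The HTML string here is made up for illustration, and find_all() is used in place of find_all_next() so the search stays inside the wrench div:
```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the snippet in the question.
HTML = """
<div class="wrench">
  <div class="wrenchColor">Color Blue</div>
  <div class="wrenchColor">Color Red</div>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")
wrench = soup.find("div", {"class": "wrench"})

# find_all() returns a ResultSet (a list of Tag objects), so get_text()
# has to be called on each element, not on the ResultSet itself.
matches = wrench.find_all("div", {"class": "wrenchColor"})
colors = [tag.get_text(strip=True) for tag in matches]

for color in colors:
    print(color)
```
With this sample markup, the loop prints only the text of each div, without the surrounding tags.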