I am working on a project where I am trying to build a list of items from a website by scraping its HTML with Python (BeautifulSoup, to be specific). Long story short, I am able to print the items I am looking for, but the output includes all the tags instead of just the text. That led me to try get_text() and .text, but I get an error message stating the following: "ResultSet object has no attribute 'text'. You're probably treating a list of elements like a single element."
This is what I tried in order to get only the text, but I get the AttributeError mentioned above:
import requests
from bs4 import BeautifulSoup
url = "https:mywebsite"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
rows = soup.find('div', {'class': "wrench"})
for row in rows:
    color = row.find_all_next('div', {'class': "wrenchColor"}).get_text
    print(color)
    break
For reference, the following is part of the html code:
<div class = "wrench">
    <div class = "wrenchColor">
    etc...
If I don't call get_text, I get no error and the output is: " <div class = "wrenchColor">Color Blue</div>, <div class = "wrenchColor">Color Red</div> "
In this case, I would like to just get "Color Blue" and "Color Red". Any help would be greatly appreciated.
2 Answers
For each element returned by soup.find_all(), you just need to access its text attribute.
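As a minimal sketch of that idea: the HTML string below is a hypothetical reconstruction of the markup described in the question, not the real page, and the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, reconstructed from the question's description.
html = """
<div class="wrench">
  <div class="wrenchColor">Color Blue</div>
  <div class="wrenchColor">Color Red</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list of Tag objects), so read
# .text from each Tag rather than from the ResultSet itself.
colors = []
for row in soup.find_all("div", {"class": "wrenchColor"}):
    colors.append(row.text)

print(colors)  # ['Color Blue', 'Color Red']
```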
If you are sure that every tag inside the wrench div is one of the color tags you need, you can use the following method:
Which gives the following result:
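One way this approach might look, with its printed output, is sketched below. It assumes the hypothetical markup shown in the code (reconstructed from the question) and that every direct child div of the wrench div is a color entry.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, reconstructed from the question's description.
html = """
<div class="wrench">
  <div class="wrenchColor">Color Blue</div>
  <div class="wrenchColor">Color Red</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns a single Tag (or None if nothing matches).
wrench = soup.find("div", class_="wrench")

colors = []
if wrench is not None:
    # recursive=False limits the search to direct children of the wrench div.
    for child in wrench.find_all("div", recursive=False):
        colors.append(child.get_text(strip=True))

for color in colors:
    print(color)
# Color Blue
# Color Red
```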
If that is not the case and you want shorter code, you can use the following method:
Which gives the following result:
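A shorter variant could be sketched with a CSS selector and a list comprehension. The markup is again a hypothetical reconstruction, and the selector assumes the color divs carry the wrenchColor class and sit somewhere inside the wrench div.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, reconstructed from the question's description.
html = """
<div class="wrench">
  <div class="wrenchColor">Color Blue</div>
  <div class="wrenchColor">Color Red</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector and returns a list of matching Tags,
# so unrelated tags inside the wrench div are skipped.
colors = [tag.get_text(strip=True) for tag in soup.select("div.wrench div.wrenchColor")]

print(colors)  # ['Color Blue', 'Color Red']
```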
The problem you're facing is that the find_all_next() method returns a list of elements (a ResultSet), not a single tag. To access the content inside each tag, you need a separate for loop over the results of this method, calling get_text() on each element.
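The difference can be illustrated with a small sketch. The markup is a hypothetical reconstruction of the question's HTML; the first attempt reproduces the AttributeError, and the list comprehension applies the fix described above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, reconstructed from the question's description.
html = """
<div class="wrench">
  <div class="wrenchColor">Color Blue</div>
  <div class="wrenchColor">Color Red</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
wrench = soup.find("div", {"class": "wrench"})

# find_all_next() returns a ResultSet of every matching element that
# appears after the wrench tag in the document.
matches = wrench.find_all_next("div", {"class": "wrenchColor"})

# Calling get_text() on the ResultSet itself fails, as in the question.
try:
    matches.get_text()
    raised = False
except AttributeError:
    raised = True

# Calling get_text() on each element of the ResultSet works.
colors = [tag.get_text() for tag in matches]

print(raised)  # True
print(colors)  # ['Color Blue', 'Color Red']
```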