
I am working on a project where I am trying to build a list of items from a website by scraping its HTML with Python (BeautifulSoup, to be specific). Long story short, I can print the items I am looking for, but the output includes all the tags instead of just the text. That led me to try get_text and text, but I get an error message stating: "ResultSet object has no attribute 'text'. You're probably treating a list of elements like a single element."

This is what I tried in order to get only the text, but I get the AttributeError mentioned above:

import requests
from bs4 import BeautifulSoup


url = "https:mywebsite"

response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

rows = soup.find('div', {'class': "wrench"})



for row in rows:


        color = row.find_all_next('div', {'class': "wrenchColor"}).get_text

        print(color)
        break

For reference, the following is part of the html code:

<div class="wrench">
   <div class="wrenchColor">
etc...

If I don't use get_text, I get no error and the output is: " <div class = "wrenchColor">Color Blue</div>, <div class = "wrenchColor">Color Red</div> "

In this case, I would like to get just "Color Blue" and "Color Red". Any help would be greatly appreciated.

2 Answers


  1. For each element returned by soup.find_all(), you just need to access its text attribute.

    from bs4 import BeautifulSoup as BS
    import requests
    
    try:
        import lxml
        PARSER = 'lxml'
    except ModuleNotFoundError:
        PARSER = 'html.parser'
    
    URL = 'your URL goes here'
    with requests.get(URL) as response:
        response.raise_for_status()
        soup = BS(response.text, PARSER)
        for row in soup.find_all('div', {'class': 'wrenchColor'}):
            print(row.text)
    
  2. If you are sure that everything inside the wrench div is color text you need, you can use the following method:

    rows = soup.find('div', {'class': "wrench"})
    for color in rows.stripped_strings:
        print(color)
    

    Which gives the following result:

    Color Blue
    Color Red
    Color Green
    Color White
    Color Black
    

    If you are not sure, and you want shorter code, you can use the following method:

    rows = map(lambda x: x.string, soup.select('div.wrench > div.wrenchColor'))
    print(list(rows))
    

    Which gives the following result:

    ['Color Blue', 'Color Red', 'Color Green', 'Color White', 'Color Black']
    

    The problem you're facing is that find_all_next() returns a ResultSet (a list of elements), not a single element. To get the text inside each tag, you need a separate for loop over the results of this method, calling get_text() on each element.
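
As a rough illustration of that last point, here is a minimal sketch that loops over the results and calls get_text() on each element. The HTML string is a hypothetical stand-in modeled on the snippet in the question, and extract_colors is a made-up helper name. It also swaps find_all_next() for find_all() on the wrench div, on the assumption that the color divs are nested inside it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the question's snippet (not the real page).
HTML = """
<div class="wrench">
  <div class="wrenchColor">Color Blue</div>
  <div class="wrenchColor">Color Red</div>
</div>
"""


def extract_colors(html):
    """Return the text of every wrenchColor div inside the wrench div."""
    soup = BeautifulSoup(html, "html.parser")
    wrench = soup.find("div", {"class": "wrench"})
    if wrench is None:
        # No container found, so there is nothing to extract.
        return []
    # find_all() returns a ResultSet (a list of Tag objects), so get_text()
    # has to be called on each Tag, not on the ResultSet itself.
    return [
        tag.get_text(strip=True)
        for tag in wrench.find_all("div", {"class": "wrenchColor"})
    ]


colors = extract_colors(HTML)
print(colors)  # ['Color Blue', 'Color Red']
```

Note that get_text needs its parentheses; without them you get the bound method rather than the text.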
