
I'm working on an eBay scraper, and I'm running into a simple "AttributeError: 'NoneType' object has no attribute 'text'".

Here is my code:

url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=2017+patrick+mahomes+psa+10+auto&_sacat=0&rt=nc&LH_Sold=1&LH_Complete=1&_ipg=100'

def get_data(url):
    r = requests.get(url)
    soup = bs(r.text, 'html.parser')
    return soup

def parse(soup):
    productslist = []
    results = soup.find_all('div', {'class': 's-item__info clearfix'})
    for item in results:
        product = {
            'title': item.find('h3', class_='s-item__title s-item__title--has-tags').text,
            'soldprice': float(item.find('span', class_='s-item__price').text.replace('$', '').replace(',','').strip()),
            'solddate': item.find('span', class_='s-item__title--tagblock__COMPLETED').find('span', class_='POSITIVE').text,
            'bids': item.find('span', class_='s-item__bids s-item__bidCount').text,
            'link': item.find('a', class_='s-item__link')['href'],
            }
        productslist.append(product)
    return productslist

def output(productslist):
    productsdf = pd.DataFrame(productslist)
    productsdf.to_csv('2017_Patrick_Mahomes_Rookies.csv', index=False)
    print('Saved to CSV')
    return

soup = get_data(url)
productslist = parse(soup)
output(productslist)
print(parse(soup))

I'm having trouble with this line:

'bids': item.find('span', class_='s-item__bids s-item__bidCount').text,

It raises this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-48-0c9ceb760d75> in <module>
     27 
     28 soup = get_data(url)
---> 29 productslist = parse(soup)
     30 output(productslist)
     31 print(parse(soup))

<ipython-input-48-0c9ceb760d75> in parse(soup)
     14             'soldprice': float(item.find('span', class_='s-item__price').text.replace('$', '').replace(',','').strip()),
     15             'solddate': item.find('span', class_='s-item__title--tagblock__COMPLETED').find('span', class_='POSITIVE').text,
---> 16             'bids': item.find('span', class_='s-item__bids s-item__bidCount').text,
     17             'link': item.find('a', class_='s-item__link')['href'],
     18             }

AttributeError: 'NoneType' object has no attribute 'text'

When I remove .text, the code runs and I get a CSV file, but the "bids" column then holds raw span HTML for listings that have bids and blanks for those that do not.

I want to keep every row in the "bids" column, strip the span HTML so that only the text remains, and fill the blank fields with "N/A". When I use try/except statements, all the rows without bids are deleted (e.g. only rows with "5 bids" and "89 bids" are kept, and all the others are dropped).

Still a beginner, so I apologize for the poor explanation.

2 Answers


  1. Check whether the s-item__bids s-item__bidCount tag exists before reading its text, like this:

    bids_tag = item.find('span', class_='s-item__bids s-item__bidCount')
    if bids_tag:
        bids = bids_tag.text
    else:
        bids = ''
    

    And then use that variable in the dictionary:

    'bids': bids,
    

    You can also try my library, https://github.com/eugen1j/beautifulsoup4-helpers

    Here is the same lookup using this library:

    'bids': select_text_one(item, 'span.s-item__bids.s-item__bidCount')
    
  2. Do this before building the product dictionary:

    try:
        bids = item.find('span', class_='s-item__bids s-item__bidCount').text
    except AttributeError:  # find() returned None: this listing has no bid count
        bids = ''
    

    Replace bids = '' with bids = 'N/A' if you want to write N/A when bids are not available.

    Then use bids in the product dictionary:

    'bids': bids,
    

    Or, here is the full code:

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup as bs
    
    url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=2017+patrick+mahomes+psa+10+auto&_sacat=0&rt=nc&LH_Sold=1&LH_Complete=1&_ipg=100'
    
    def get_data(url):
        r = requests.get(url)
        soup = bs(r.text, 'html.parser')
        return soup
    
    def parse(soup):
        productslist = []
        results = soup.find_all('div', {'class': 's-item__info clearfix'})
        for item in results:
            
            try:
                bids = item.find('span', class_='s-item__bids s-item__bidCount').text
            except AttributeError:  # find() returned None: this listing has no bid count
                bids = ''
    
            product = {
                'title': item.find('h3', class_='s-item__title s-item__title--has-tags').text,
                'soldprice': float(item.find('span', class_='s-item__price').text.replace('$', '').replace(',','').strip()),
                'solddate': item.find('span', class_='s-item__title--tagblock__COMPLETED').find('span', class_='POSITIVE').text,
                'bids': bids,
                'link': item.find('a', class_='s-item__link')['href'],
                }
            productslist.append(product)
        return productslist
    
    def output(productslist):
        productsdf = pd.DataFrame(productslist)
        productsdf.to_csv('2017_Patrick_Mahomes_Rookies.csv', index=False)
        print('Saved to CSV')
        return
    
    soup = get_data(url)
    productslist = parse(soup)
    output(productslist)
    print(parse(soup))
    

    This works for me.

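Both answers come down to the same idea: guard only the optional field, not the whole row. The sketch below illustrates that with a small helper. The names text_or_default, parse_item and FakeItem are hypothetical and not part of BeautifulSoup; FakeItem is only a stand-in that mimics find() returning either a tag-like object or None, so the snippet runs without a network request. The helper only relies on the .text attribute, so it should accept real BeautifulSoup tags as well.

```python
from types import SimpleNamespace


def text_or_default(tag, default="N/A"):
    """Return the stripped text of a tag, or `default` when the tag is None."""
    if tag is None:
        return default
    return tag.text.strip()


class FakeItem:
    """Hypothetical stand-in for one search result (not a BeautifulSoup class).

    find() returns an object with a .text attribute when the class string
    is known, and None otherwise, like find() on a real tag.
    """

    def __init__(self, fields):
        self._fields = fields

    def find(self, name, class_=None):
        text = self._fields.get(class_)
        return None if text is None else SimpleNamespace(text=text)


def parse_item(item, default="N/A"):
    """Build one row; a missing bid count becomes `default` instead of an error."""
    return {
        "title": text_or_default(
            item.find("h3", class_="s-item__title s-item__title--has-tags")
        ),
        "bids": text_or_default(
            item.find("span", class_="s-item__bids s-item__bidCount"), default
        ),
    }


# Two made-up listings: one auction with bids, one without a bid count.
items = [
    FakeItem({
        "s-item__title s-item__title--has-tags": "Card A",
        "s-item__bids s-item__bidCount": " 5 bids ",
    }),
    FakeItem({"s-item__title s-item__title--has-tags": "Card B"}),
]

rows = [parse_item(item) for item in items]
for row in rows:
    print(row)
```

Every listing still produces a row here. If a try/except instead wraps the whole dictionary or the append call, any listing that raises is skipped entirely, which would explain rows without bids disappearing as described in the question; that is a guess, since the asker's try/except code is not shown.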