
The script below is meant to look through eBay listings on the eBay search results page. The search page is just a list, so I am trying to loop through each li tag and append its content to a variable. For some reason this script doesn't seem to work and I'm not sure why.

from urllib.request import urlopen
from bs4 import BeautifulSoup

# specify the url
url = "https://www.ebay.co.uk/sch/i.html?_from=R40&_nkw=funko+gamora+199&_sacat=0&LH_Sold=1&LH_Complete=1&rt=nc&LH_PrefLoc=1&_ipg=200"

# Connect to the website and return the HTML to the variable 'page'
try:
    page = urlopen(url)
except Exception as e:
    # Exit here - otherwise 'page' is undefined and the next line fails anyway
    raise SystemExit(f"Error opening the URL: {e}")

# parse the html using beautiful soup and store in variable `soup`
soup = BeautifulSoup(page, 'html.parser')

# Find the <ul> that contains the search results
content = soup.find('ul', {"class": "srp-results srp-list clearfix"})

#print(content)

article = ''
for i in content.find_all('li'):
    article = article + ' ' + i.text
print(article)

# Saving the scraped text
with open('scraped_text.txt', 'w') as file:
    file.write(article)

Can anyone see where I’m going wrong?

2 Answers


  1. This is what the response looks like:

    print(soup.text)
    

    Security measureSkip to main content Please verify yourself to continueerror To keep eBay a safe place to buy and sell, we will occasionally ask you to verify yourself. This helps us to block unauthorised users from entering our site.Please verify yourselfIf you’re having difficulties with the rendering of images on the above verification page, eBay suggests using the latest version of your browser or an alternate browser listed in here Additional site navigationAbout eBayAnnouncementsCommunitySafety CentreResolution CentreSeller CentreVeRO: Protecting Intellectual PropertyPoliciesHelp & ContactSite MapCopyright © 1995-2021 eBay Inc. All Rights Reserved. User Agreement, Privacy, Cookies and AdChoiceNorton Secured – powered by Verisign

    It's an error on eBay's end: instead of the search results, eBay returned a verification page, so your code has nothing to parse. The code itself looks fine at first glance. Also, note that web scraping is a grey area and some companies do not allow it; you might need to bypass security measures.
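    As a rough sketch, one way to spot this situation before parsing is to look for marker strings from the verification page. The marker strings below are assumptions taken from the response pasted above, not an official API:

```python
def looks_blocked(html_text: str) -> bool:
    """Heuristic: True if the HTML looks like eBay's verification page.
    The marker strings are guesses taken from the response pasted above."""
    markers = ("Please verify yourself", "Security measure")
    return any(marker in html_text for marker in markers)

# A snippet of the blocked response shown above
blocked_html = "Security measureSkip to main content Please verify yourself to continue"
print(looks_blocked(blocked_html))  # True: this is the verification page, not results
```

    Checking this up front gives a clear error message instead of a confusing AttributeError later in the script.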

    Also, you should comment your code in a way that tells the reader WHY it does what it does, not what it does. You don't need to comment lines like soup = BeautifulSoup(page, 'html.parser').

    Edit: I forgot to mention that the error appears because

    content = soup.find('ul', {"class": "srp-results srp-list clearfix"})
    

    found no results: find() returned None, so content.findAll('li') raises an AttributeError.
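    A minimal guard avoids that AttributeError when find() returns None. The HTML snippet here is a hypothetical stand-in for the page eBay actually returned:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page eBay actually returned (no results list in it)
html = "<html><body><p>Please verify yourself</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

content = soup.find('ul', {"class": "srp-results srp-list clearfix"})
if content is None:
    # find() returned None, so calling content.find_all('li') would raise AttributeError
    print("Results list not found - probably a verification page")
else:
    article = ' '.join(li.text for li in content.find_all('li'))
    print(article)
```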

  2. Most likely you are getting a CAPTCHA or hitting an IP rate limit. See ways to avoid being blocked.

    If you need to extract all results from all pages using pagination, the solution is to use non-token pagination and test for an element (such as the 'next page' button) whose absence ends the loop:

    if soup.select_one(".pagination__next"):   # checking for 'next page' button
        params['_pgn'] += 1                    # if there is a button, it will go to the next page
    else:                                      # otherwise, the loop exits
        break
    

    You can also exit the loop after a set number of retrieved pages by adding a limit:

    limit = 5                     # page limit
    
    # other code
    
    if params['_pgn'] == limit:   # if the page number is equal to the specified limit, the loop is terminated
        break
    

    Code example with pagination in the online IDE.

    from bs4 import BeautifulSoup
    import requests, json, lxml
    
    # https://requests.readthedocs.io/en/latest/user/quickstart/#custom-headers
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
    }
       
    params = {
        "_nkw": "iphone_14",    # search query example
        "LH_Sold": "1",         # shows sold items
        "_pgn": 1               # page number
    }
    
    data = []
    limit = 5                 # page limit (if needed)
    while True:
        page = requests.get("https://www.ebay.co.uk/sch/i.html", params=params, headers=headers, timeout=30)
        soup = BeautifulSoup(page.text, "lxml")

        print(f"Extracting page: {params['_pgn']}")
        print("-" * 10)

        for products in soup.select(".s-item__info"):
            title = products.select_one(".s-item__title span").text
            price = products.select_one(".s-item__price").text

            data.append({
                "title": title,
                "price": price
            })

        if params['_pgn'] == limit:
            break
        if soup.select_one(".pagination__next"):
            params['_pgn'] += 1
        else:
            break
    
    print(json.dumps(data, indent=2, ensure_ascii=False))
    

    Example output:

    [
      {
        "title": "Case For iPhone 11 Pro Max 14Pro 8 7  SE 2022  Shockproof Silicone Cover colours",
        "price": "£3.99"
      },
      {
        "title": "Ring Holder Magnetic Shockproof Case Cover For iPhone  14Pro Max 11 XR  XS 12 13",
        "price": "£5.99 to £6.99"
      },
      {
        "title": "Apple iPhone 14 - 128GB - Space Black (Unlocked) A2890 (GSM)",
        "price": "£641.95"
      },
      other results ...
    ]
    

    As an alternative, you can use the Ebay Organic Results API from SerpApi. It's a paid API with a free plan that handles blocking and parsing on their backend.

    Example code with pagination:

    from serpapi import EbaySearch
    import json
    
    params = {
        "api_key": "...",                 # serpapi key, https://serpapi.com/manage-api-key   
        "engine": "ebay",                 # search engine
        "ebay_domain": "ebay.co.uk",      # ebay domain
        "_nkw": "iphone_14",              # search query
        "LH_Sold": "1",                   # shows sold items
        "_pgn": 1                         # page number
    }
    
    search = EbaySearch(params)           # where data extraction happens
    
    limit = 5
    page_num = 0
    data = []
    
    while True:
        results = search.get_dict()     # JSON -> Python dict
    
        if "error" in results:
            print(results["error"])
            break
        
        for organic_result in results.get("organic_results", []):
            title = organic_result.get("title")
            price = organic_result.get("price")
    
            data.append({
                "title": title,
                "price": price
            })

        page_num += 1
        print(page_num)

        if params['_pgn'] == limit:
            break
        if "next" in results.get("pagination", {}):
            params['_pgn'] += 1
        else:
            break
    
    print(json.dumps(data, indent=2, ensure_ascii=False))
    

    Output:

    [
      {
        "title": "Apple iPhone 14 Plus Midnight - 512GB - Unlocked - MINT CONDITION",
        "price": {
          "raw": "£749.99",
          "extracted": 749.99
        }
      },
      {
        "title": "New listingApple iPhone 14 Plus (PRODUCT)RED - 128GB (Unlocked)",
        "price": {
          "raw": "£750.00",
          "extracted": 750.0
        }
      },
      other results ...
    ]
    

    There's a "13 ways to scrape any public data from any website" blog post if you want to know more about web scraping.
