I am trying to run a BS4 eBay scraper based on a YouTube video by a Russian guy. I am quite new to this style of scraper; I have only used Selenium so far, but I wanted to try something new and faster that could be deployed once on a server, so that I could get the scraped data on my phone, for example, when I'm not at home. My goal is to print the text of the h3 elements I parse, but the output is this: ("<h3 class="s-item__title">Samsung NP9004D 900X Intel i5-3317U / 4GB RAM / 13,3 Zoll</h3>") instead of this: (Samsung NP9004D 900X Intel i5-3317U / 4GB RAM / 13,3 Zoll)
Could anyone please explain what I'm doing wrong? Thank you very much!
PS: If anyone knows a better way of creating an eBay-specific scraper, help would be appreciated! Like automating stuff or creating a custom API…

import requests
from bs4 import BeautifulSoup


def get_page(url):
    response = requests.get(url)

    if not response.ok:
        print("Server responded:", response.status_code)
    else:
        soup = BeautifulSoup(response.text, "lxml")
    return soup


def get_detail_data(soup):
    listing = soup.find("div", {"class": "s-item__wrapper clearfix"})
    name = soup.find_all("h3", {"class": "s-item__title"}, text=True)
    print(name)






def main():
    url = "https://www.ebay.de/b/Laptops-Notebooks/175672/bn_1618754?LH_ItemCondition=7000&mag=1&rt=nc&_sop=1"
    get_page(url)
    get_detail_data(get_page(url))


if __name__ == "__main__": 
    main()

2 Answers


  1. find_all returns a list of elements matching the class, so you have to loop over it and extract the text from each element. You can use the .get_text() method or the .text attribute for that.

    data = soup.find_all("h3", class_="s-item__title")
    products = [i.get_text(strip=True) for i in data]
    
  2. Try this:

    def get_detail_data(soup):
        # find_all returns a list of Tag objects, so print each title's text
        names = soup.find_all("h3", {"class": "s-item__title"})
        for name in names:
            # chr(9650) is the "▲" character; remove it if it is present
            print(name.text.strip().replace(chr(9650), ""))
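
Both answers come down to the same point: find_all gives back a list of tags, and the text has to be pulled out of each tag. As far as I know, text=True in the question only filters which tags match and does not change what is returned. Here is a minimal sketch that can be run without hitting eBay. SAMPLE_HTML and extract_titles are made-up names for illustration, the second title is invented, and the markup is only an assumption about what the page looks like. It uses the built-in "html.parser" instead of "lxml" so nothing extra needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a fragment of the listing page (not real eBay markup).
SAMPLE_HTML = """
<div class="s-item__wrapper clearfix">
  <h3 class="s-item__title">Samsung NP9004D 900X Intel i5-3317U / 4GB RAM / 13,3 Zoll</h3>
</div>
<div class="s-item__wrapper clearfix">
  <h3 class="s-item__title">  Example Laptop B  </h3>
</div>
"""


def extract_titles(soup):
    """Return the text of every h3 with class s-item__title as plain strings."""
    # find_all returns a list of Tag objects, not strings.
    tags = soup.find_all("h3", class_="s-item__title")
    # get_text(strip=True) drops the markup and the surrounding whitespace.
    return [tag.get_text(strip=True) for tag in tags]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
titles = extract_titles(soup)
for title in titles:
    print(title)
```

Returning the list instead of printing inside the function makes it easier to reuse the titles later, for example to send them somewhere else from a server.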
    