
A website has the following HTML, which contains only one element with itemprop="brand":

<dd itemprop="brand" class="attributeValue-2574930263">
    <a class="link-3970392289 link__default-1151936189 attributeLink-2414863004" href="/b-cars-trucks/ottawa/hyundai/c174l1700185a54">Hyundai</a>
</dd>

I cannot select by class, because there are many of them.
I am trying to get the text (Hyundai) with this code:

def get_car_datas(url):
    req = requests.get(url, headers=headers)
    src = req.text
    soup = BeautifulSoup(src, "lxml")
    car_brand = soup.find(itemprop_="brand").next_elements
    print(car_brand)

I get this message:
AttributeError: 'NoneType' object has no attribute 'next_elements'

How can I get the text Hyundai?

4 Answers


  1. Chosen as BEST ANSWER

    I am testing with only one page. Sometimes the response is the text I am searching for (Hyundai, BMW, Kia...), and sometimes it is "Brand not found :(". Here is the full code:

    import re
    import requests
    from bs4 import BeautifulSoup
    
    headers = {
        "Accept": "*/*",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"
    }
    
    def get_car_datas(url):
        req = requests.get(url, headers=headers)
        src = req.text
        soup = BeautifulSoup(src, "lxml")
        car_brand = soup.find(attrs={"itemprop": "brand"})
        if car_brand:
            car_brand = car_brand.find(class_="link-3970392289").text
            print(car_brand)
        else:
            print("Brand not found :(")
    
    for page_number in range(1, 2):
        url = f"https://www.kijiji.ca/b-cars-trucks/ontario/suv/2017__/page-{page_number}/k0c174l9004a68?price=__25000&kilometers=__80000"
        req = requests.get(url, headers=headers)
        src = req.text
        soup = BeautifulSoup(src, "lxml")
        page_number_text = soup.find(class_="pagination").text
        if "Next" in page_number_text:
            get_car_links = soup.find_all(class_="info-container")
            for i in get_car_links:
                # Only follow relative links; absolute ones are skipped
                if "https" not in i.find('a', class_='title').get('href'):
                    car_links = i.find("a", class_="title")
                    car_url = ('https://www.kijiji.ca' + car_links.get("href"))
                    get_car_datas(car_url)
                    print(car_url)
            print(f"Page {page_number} done!")
            continue
        else:
            get_car_links = soup.find_all(class_="info-container")
            for i in get_car_links:
                car_links = i.find("a", class_="title")
                car_url = ('https://www.kijiji.ca' + car_links.get("href"))
            print("Last page done!")
            break
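
When the same page sometimes yields a brand and sometimes "Brand not found :(", it can help to check what the server actually returned before blaming the selector. The sketch below is only a diagnostic aid, not an explanation of the cause: `describe_page` is a hypothetical helper name, and "html.parser" is used instead of "lxml" purely to keep the example dependency-free.

```python
from bs4 import BeautifulSoup


def describe_page(status_code: int, html: str) -> str:
    """Summarise whether a brand lookup on this response can succeed."""
    # A non-200 status means the listing page itself was not served.
    if status_code != 200:
        return f"HTTP {status_code}: the listing page was not served"

    soup = BeautifulSoup(html, "html.parser")

    # A 200 response may still carry different markup than expected.
    if soup.find(attrs={"itemprop": "brand"}) is None:
        return 'HTTP 200, but the markup has no itemprop="brand" element'

    return "brand element present"


ok_html = '<dd itemprop="brand"><a href="#">Hyundai</a></dd>'
print(describe_page(200, ok_html))
print(describe_page(200, "<p>something else</p>"))
print(describe_page(403, ""))
```

Inside get_car_datas, the same helper could be called as describe_page(req.status_code, req.text) in the else branch to log what came back for the failing URL.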
    
    

  2. You have a typo in the find() method. Instead of using itemprop_="brand", you should use itemprop="brand".
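
The difference can be seen on the markup from the question. This is a minimal sketch: the HTML is trimmed to the relevant element, and "html.parser" is used instead of "lxml" only so the example needs nothing beyond bs4.

```python
from bs4 import BeautifulSoup

# Markup from the question, trimmed to the relevant element.
HTML = """
<dd itemprop="brand" class="attributeValue-2574930263">
    <a href="/b-cars-trucks/ottawa/hyundai/c174l1700185a54">Hyundai</a>
</dd>
"""

soup = BeautifulSoup(HTML, "html.parser")

# The trailing underscore makes BeautifulSoup look for an attribute
# literally named "itemprop_", which does not exist in the markup.
with_typo = soup.find(itemprop_="brand")

# Without the underscore the keyword matches the itemprop attribute.
without_typo = soup.find(itemprop="brand")

print(with_typo)          # None
print(without_typo.name)  # dd
```

Only class needs a trailing underscore (class_), because class is a reserved word in Python; other attribute names are passed as they appear in the HTML.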

  3. This error means that BeautifulSoup was not able to find the given element.

    Try this:

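    def get_car_datas(url):
        req = requests.get(url, headers=headers)
        src = req.text
        soup = BeautifulSoup(src, "lxml")
        try:
            # find() returns None when nothing matches, so the attribute
            # access below raises AttributeError in that case
            car_brand = soup.find(itemprop="brand").get_text(strip=True)
            print(car_brand)
        except AttributeError:
            car_brand = "<div> No elements found </div>"
            print(car_brand)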
    
  4. This error is caused because soup.find(itemprop_="brand") has not found any element with itemprop_="brand", therefore returning None. Since None does not have an attribute called next_elements, this error is thrown.

    First of all, you have a typo, as mentioned by Teroaz: the attribute you are searching for is itemprop, not itemprop_. So you could do this (another option follows below):

    def get_car_datas(url):
        req = requests.get(url, headers=headers)
        src = req.text
        soup = BeautifulSoup(src, "lxml")
        car_brand = soup.find(itemprop="brand")
    
        if car_brand:
            car_brand = car_brand.get_text(strip=True)
            print(car_brand)
        else:
            print("Brand not found :(")
    

    Passing the attribute through attrs is more explicit about what you are searching for, so you could also do this:

    def get_car_datas(url):
        req = requests.get(url, headers=headers)
        src = req.text
        soup = BeautifulSoup(src, "lxml")
        car_brand = soup.find(attrs={"itemprop": "brand"})
    
        if car_brand:
            car_brand = car_brand.get_text(strip=True)
            print(car_brand)
        else:
            print("Brand not found :(")
    

    Both forms match the same element, so use whichever you find clearer.
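
Note that .next_elements, as used in the question, is a generator over the nodes that come after the tag, so printing it does not show the brand. The following is a minimal sketch of reading the text instead; the HTML is a trimmed copy of the question's markup and "html.parser" stands in for "lxml" to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

HTML = (
    '<dd itemprop="brand">'
    '<a href="/b-cars-trucks/ottawa/hyundai/c174l1700185a54">Hyundai</a>'
    '</dd>'
)

soup = BeautifulSoup(HTML, "html.parser")
tag = soup.find(attrs={"itemprop": "brand"})

# next_elements lazily yields the nodes parsed after the tag,
# here the <a> tag followed by the "Hyundai" string.
following = list(tag.next_elements)

# get_text() joins the strings inside the tag; strip=True trims whitespace.
brand = tag.get_text(strip=True)

print(brand)  # Hyundai
```

Because get_text() works on the itemprop element itself, it does not depend on any of the generated class names.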
