
I have this web scraper program in Python, but it prints both tennis players, Felix and Alexander. I would like to print only the first tennis player as a separate item and exclude all the ones after it. What do I need to change in the code to do this?

For reference, I wrote this in Visual Studio 2022 and set the program up to use the Microsoft Edge web browser.

import requests
from bs4 import BeautifulSoup

response = requests.get("https://www.betexplorer.com/tennis/atp-singles/basel/auger-aliassime-felix-bublik-alexander/U5HIueTc/")
webpage = response.content

soup = BeautifulSoup(webpage, "html.parser")

for h2 in soup.find_all('h2'):
    values = [data for data in h2.find_all('a')]
    for value in values:
        print(value.text.replace(" ","_"))
    print()

2 Answers


  1. Instead of the loop, print the first h2 directly. soup.h2 returns only the first h2 tag in the document:

    print(soup.h2.text.strip())
    
  2. Instead of looping through each tag individually, you can use the select() function with a CSS selector to collect the matching tags and print only the first one.

    import requests
    from bs4 import BeautifulSoup
    
    response = requests.get("https://www.betexplorer.com/tennis/atp-singles/basel/auger-aliassime-felix-bublik-alexander/U5HIueTc/")
    webpage = response.content
    
    soup = BeautifulSoup(webpage, "html.parser")
    
    print(soup.select('h2 a')[0].text.replace(' ','_'))
    
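Both answers assume the page contains at least one matching tag. As a minimal sketch of the same idea with a guard for the empty case, select_one() returns the first match or None, so there is no index to go out of range. The SAMPLE_HTML string and the first_player() helper below are hypothetical stand-ins so the example runs without a network request; the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the fetched page (assumption): one <h2> per
# player, each wrapping a link. The real page's markup may differ.
SAMPLE_HTML = """
<h2><a href="/player/1/">Auger-Aliassime Felix</a></h2>
<h2><a href="/player/2/">Bublik Alexander</a></h2>
"""


def first_player(html):
    """Return the text of the first <a> inside an <h2>, or None if none exists."""
    soup = BeautifulSoup(html, "html.parser")
    # select_one() stops at the first match and returns None when nothing matches
    link = soup.select_one("h2 a")
    if link is None:
        return None
    return link.get_text(strip=True).replace(" ", "_")


print(first_player(SAMPLE_HTML))                # Auger-Aliassime_Felix
print(first_player("<p>no headings here</p>"))  # None
```

With the real page, passing response.content to first_player() in place of SAMPLE_HTML should behave the same way, provided the players are still rendered as links inside h2 tags.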