I’m trying to get the wins from the ‘Overall match stats’ on this page: https://www.fctables.com/teams/sunderland-194998/?template_id=11. Everything I try just returns ‘None’. This isn’t the only page I have tried, but every one seems to return ‘None’. I’m not very advanced in this, so any help would be appreciated.

from bs4 import BeautifulSoup
import requests

URL = "https://www.fctables.com/teams/sunderland-194998/"
response = requests.get(URL)

soup = BeautifulSoup(response.text, "html.parser")

wins = soup.find('div', class_='text-success ')
print(wins)

I need it to output ‘6’, which is the number of wins, preferably as an integer.

2 Answers


  1. BeautifulSoup is famously the package that lets
    you parse other people’s HTML garbage as if it were grammatically correct.
    The HTML grammar is, alas, a bit complicated.

    You got hung up on the trailing SPACE in the class name.
    Just strip it.

    >>> from pprint import pp
    >>>
    >>> pp(soup.find_all('div', class_='text-success '))
    []
    >>> pp(soup.find_all('div', class_='text-success'))
    [<div class="text-success">11</div>,
     <div class="text-success">1.83</div>,
     <div class="text-success">4</div>,
     <div class="text-success">4/6</div>,
     <div class="text-success">5/6</div>,
     <div class="text-success">2/6</div>,
     <div class="text-success">2/6</div>,
     <div class="text-success">41</div>,
     <div class="text-success">2.16</div>,
     <div class="text-success">11</div>,
     <div class="text-success">78.9%</div>,
     <div class="text-success">89.5%</div>,
     <div class="text-success">21.05%</div>,
     <div class="text-success">68.42%</div>]
    

    Steve Harvey wants to know, "Could SPACE ever be
    part of a valid class name?"
    Survey says "nope!" In the class attribute, whitespace separates
    one class name from the next, so the SPACE character can never be part of a name.

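The same behaviour can be reproduced offline. The sketch below runs on a made-up HTML snippet (the markup and the variable names `html`, `with_space` and `without_space` are assumptions for illustration, not taken from the real page). As far as I can tell, BeautifulSoup splits the class attribute on whitespace before matching, so a search string with a trailing space has nothing to match against:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration, not copied from the real page.
html = '<div class="text-success">6</div><div class="text-danger">4</div>'
soup = BeautifulSoup(html, "html.parser")

# Same lookup twice: once with the trailing space, once without it.
with_space = soup.find("div", class_="text-success ")
without_space = soup.find("div", class_="text-success")

print(with_space)
print(without_space)
```

The first lookup should come back empty, like the ‘None’ in the question, while the second should return the div.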
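  2. You can change how you select the tags: find the ‘Overall matches stats’ heading, then read the list items in the div that follows it into a dictionary: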

    import requests
    from bs4 import BeautifulSoup
    
    URL = "https://www.fctables.com/teams/sunderland-194998/?template_id=11"
    response = requests.get(URL)
    
    soup = BeautifulSoup(response.text, "html.parser")
    
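    # Collect every label/value pair listed under the "Overall matches stats" heading.
    stats = {}
    for li in soup.select("h3:-soup-contains('Overall matches stats') + div li"):
        # The code reads the label from the <p> and the value from the <div> inside each <li>.
        stats[li.p.text] = li.div.text
    
    print(stats["Wins"])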
    

    Prints:

    6
    

    The stats dictionary contains:

    {
        "Matches": "11",
        "Goals": "20",
        "per game": "1.82",
        "Wins": "6",
        "Draws": "1",
        "Losses": "4",
        "Over 2.5": "72.7%",
        "Over 1.5": "81.8%",
        "CS": "36.36%",
        "BTTS": "45.45%",
    }
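
The question asks for an integer, and every value in the dictionary is a string. Here is a minimal sketch of the conversion, using a hard-coded copy of the values above so it runs without a network request; the helper name `to_number` is made up for illustration:

```python
# Hard-coded copy of the scraped values shown above (all strings).
stats = {
    "Matches": "11",
    "Goals": "20",
    "per game": "1.82",
    "Wins": "6",
    "Draws": "1",
    "Losses": "4",
    "Over 2.5": "72.7%",
    "Over 1.5": "81.8%",
    "CS": "36.36%",
    "BTTS": "45.45%",
}


def to_number(text):
    """Convert a scraped stat to int when possible, otherwise to float.

    A trailing percent sign is dropped, so "72.7%" becomes 72.7.
    """
    cleaned = text.strip().rstrip("%")
    try:
        return int(cleaned)
    except ValueError:
        return float(cleaned)


# The single value the question asks for, as an integer.
wins = int(stats["Wins"])
print(wins)

# Optionally convert the whole dictionary in one pass.
numbers = {name: to_number(value) for name, value in stats.items()}
print(numbers)
```

Calling `int()` directly is enough for the wins; the helper only matters if the decimal and percentage values are needed as numbers too.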
    