Html - Always returns 'None' when trying to get an element from web page

Xdvanced
October 7, 2023
228 views
1 vote
2 Answers

I’m trying to get the wins from the ‘Overall match stats’ on this page: https://www.fctables.com/teams/sunderland-194998/?template_id=11. Everything I try just returns ‘None’. This isn’t the only page I’ve tried, but every one seems to return ‘None’. I’m not very advanced in this, so any help would be appreciated.

from bs4 import BeautifulSoup
import requests

URL = "https://www.fctables.com/teams/sunderland-194998/"
response = requests.get(URL)

soup = BeautifulSoup(response.text, "html.parser")

wins = soup.find('div', class_='text-success ')
print(wins)

I need it to output the ‘6’ which is the number of wins. Preferably as an integer.

Answers

BeautifulSoup famously is the package that lets
you parse other people’s HTML garbage as if it were grammatically correct.
The HTML grammar is, alas, a bit complicated.

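You got hung up on the trailing SPACE in the class name.
BeautifulSoup splits a tag's class attribute on whitespace and compares your
string against the resulting names, so 'text-success ' never matches anything.
Just strip it.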

>>> from pprint import pp
>>>
>>> pp(soup.find_all('div', class_='text-success '))
[]
>>> pp(soup.find_all('div', class_='text-success'))
[<div class="text-success">11</div>,
 <div class="text-success">1.83</div>,
 <div class="text-success">4</div>,
 <div class="text-success">4/6</div>,
 <div class="text-success">5/6</div>,
 <div class="text-success">2/6</div>,
 <div class="text-success">2/6</div>,
 <div class="text-success">41</div>,
 <div class="text-success">2.16</div>,
 <div class="text-success">11</div>,
 <div class="text-success">78.9%</div>,
 <div class="text-success">89.5%</div>,
 <div class="text-success">21.05%</div>,
 <div class="text-success">68.42%</div>]

Steve Harvey wants to know, "Could SPACE ever be
part of a valid class name?"
Survey says "nope!" The HTML class attribute is a space-separated list of
class names, so a SPACE can only separate names; it can never be part of one.
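To see the effect without hitting the live site, here is a minimal sketch on a made-up HTML snippet. The markup is an assumption for illustration only and is not copied from the real page; it also shows one way to get the value as an integer, as the question asks.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for one entry of the stats list; not the real page markup.
html = '<ul><li><p>Wins</p><div class="text-success ">6</div></li></ul>'
soup = BeautifulSoup(html, "html.parser")

# Even though the markup has a trailing space inside the attribute, the parser
# splits the class attribute on whitespace, so the stored name has no space.
with_space = soup.find("div", class_="text-success ")
without_space = soup.find("div", class_="text-success")

print(with_space)  # find() returns None when nothing matches

# Guard against None before reading the text, then convert to an integer.
if without_space is not None:
    wins = int(without_space.get_text(strip=True))
    print(wins)
```

Checking for None before touching the tag avoids an AttributeError on pages where the element is missing or the class name differs.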

You can change how you select the tags:

import requests
from bs4 import BeautifulSoup

URL = "https://www.fctables.com/teams/sunderland-194998/?template_id=11"
response = requests.get(URL)

soup = BeautifulSoup(response.text, "html.parser")

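stats = {}
# Take every <li> inside the <div> that directly follows the
# "Overall matches stats" heading; "+" is the adjacent-sibling combinator.
for li in soup.select("h3:-soup-contains('Overall matches stats') + div li"):
    # The <p> holds the label (e.g. "Wins") and the <div> holds its value.
    stats[li.p.text] = li.div.text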

print(stats["Wins"])

Prints:

6

The stats dictionary contains:

{
    "Matches": "11",
    "Goals": "20",
    "per game": "1.82",
    "Wins": "6",
    "Draws": "1",
    "Losses": "4",
    "Over 2.5": "72.7%",
    "Over 1.5": "81.8%",
    "CS": "36.36%",
    "BTTS": "45.45%",
}
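Every value in that dictionary is a string, while the question asks for an integer. A minimal sketch of the conversion follows; the helper name to_number is hypothetical, and the sample values are copied from the dictionary above rather than fetched from the site.

```python
# Sample values copied from the dictionary shown above.
stats = {
    "Matches": "11",
    "per game": "1.82",
    "Wins": "6",
    "Over 2.5": "72.7%",
}


def to_number(text):
    """Hypothetical helper: parse '6', '1.82' or '72.7%' into int or float."""
    cleaned = text.strip().rstrip("%")
    try:
        return int(cleaned)
    except ValueError:
        # Not a whole number, so fall back to float (raises if not numeric).
        return float(cleaned)


wins = int(stats["Wins"])  # the single value the question asks for
numbers = {label: to_number(value) for label, value in stats.items()}

print(wins)
print(numbers)
```

The percent sign is simply dropped, so "72.7%" becomes 72.7 rather than 0.727; divide by 100 if a fraction is wanted.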
