Is there a way to specifically web scrape and get the data of heights that is not listed in text? - Html

JustAskin
February 18, 2023
245 views
0 votes
2 Answers

I’m web scraping heights for a list of athletes. I have written code that gets the heights, but after inspecting the element I noticed that the visible text gives the height in feet and inches, while the "data-sort" attribute gives the same height in inches. Both are on the td tag with class "height". However, when I use get_text() or .text to strip the HTML, I only get the height in feet, and the value in inches is lost. Is there a way to get the height in inches? That would make it easier to do the math.

Here is an example of what I’m scraping. I want to discard everything else and keep only the height in inches, which would be [79,85,74… in this case.

<td class="height" data-sort="79">6-7</td>
<td class="height" data-sort="85">7-1</td>
<td class="height" data-sort="74">6-2</td>

#This is my code

from bs4 import BeautifulSoup
import requests

urls = ['https://goduke.com/sports/mens-basketball/roster']

ListData = []
for url in urls:
    page = requests.get(url).text
    pagesoup = BeautifulSoup(page, 'html.parser')
    # Collect every <td class="height"> cell on the page
    h = pagesoup.find_all('td', class_="height")
    ListData.append(h)

for cells in ListData:
    for x in cells:
        # .text returns only the visible text (e.g. "6-7"), not the data-sort attribute
        print(x.text)

Answers

- SoheilBabadi
- February 18, 2023 at 6:06 am
- 0 votes
If you use a CSS selector, you can simply pass the first class name.

from scrapy.selector import Selector
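
For comparison, here is a minimal BeautifulSoup sketch of the same idea (not the scrapy code this answer had in mind). It assumes the real cells carry more than one class. The extra "hide-on-medium" class below is hypothetical and only there to show that the selector td.height still matches when you pass just the first class name. The rows are the three from the question.

```python
from bs4 import BeautifulSoup

# Sample rows based on the question. The second class name is a
# hypothetical stand-in for whatever extra classes the real page uses.
html = """
<table><tr>
<td class="height hide-on-medium" data-sort="79">6-7</td>
<td class="height hide-on-medium" data-sort="85">7-1</td>
<td class="height hide-on-medium" data-sort="74">6-2</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# "td.height" matches any <td> whose class list contains "height",
# regardless of other classes on the element.
cells = soup.select("td.height")

# Attributes are read like dictionary keys; get_text() only sees the text node.
inches = [td["data-sort"] for td in cells]
feet = [td.get_text() for td in cells]

print(inches)  # ['79', '85', '74']
print(feet)    # ['6-7', '7-1', '6-2']
```

Note that the attribute values come back as strings, so they still need int() before doing arithmetic.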


from bs4 import BeautifulSoup
import requests 

urls=['https://goduke.com/sports/mens-basketball/roster']

ListData=[]

for url in urls:
    page=requests.get(url).text
    pagesoup=BeautifulSoup(page,'html.parser')
    tds = pagesoup.select('td.height[data-sort]')
    for td in tds:
        ListData.append(td.attrs['data-sort'])
print(ListData)

Output:

['79', '85', '74', '74', '77', '77', '78', '77', '82', '85', '80', '84', '77', '84', '68']
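
Since the goal was to do math on the heights, a possible follow-up step is to convert those strings to integers. This is only a sketch: the list is copied from the output above, and the helper to_feet_inches is a hypothetical name used for illustration, not part of any library.

```python
# Values copied from the output above; data-sort attributes are strings.
heights = ['79', '85', '74', '74', '77', '77', '78', '77',
           '82', '85', '80', '84', '77', '84', '68']

# Convert to integers so the values can be used in arithmetic.
inches = [int(h) for h in heights]

average = sum(inches) / len(inches)
tallest = max(inches)
shortest = min(inches)


def to_feet_inches(total_inches):
    """Format a height in inches as 'feet-inches', like the page text."""
    feet, remainder = divmod(total_inches, 12)
    return f"{feet}-{remainder}"


print(inches[:3])               # [79, 85, 74]
print(round(average, 1))
print(to_feet_inches(tallest))  # 7-1
```

Converting back with divmod is a quick way to check that data-sort really is the same height as the visible text: 79 inches gives 6-7, matching the first row.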
