
How can Beautiful Soup retrieve an object with multiple attributes?

I am using the find function to retrieve this object: <h1 data-testid="organization-cover-title" class="sc-1wcv2gl-2 kpUYKd">Company</h1> With the code below, it returns a NoneType: company = soup.find('h1', attrs = {'data-testid':'organization-cover-title', 'class':'sc-1wcv2gl-2 kpUYKd'}).string It also does not return a string with one or…
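The excerpt does not say why find returns None (the element may simply be absent from the HTML that was fetched), so the following is only a minimal sketch run against the static markup quoted in the question. It shows two ways to match on the attribute and both classes, and guards against a missing tag before reading its text. The variable names tag and tag_css are illustrative.

```python
from bs4 import BeautifulSoup

# Static copy of the markup quoted in the question.
html = '<h1 data-testid="organization-cover-title" class="sc-1wcv2gl-2 kpUYKd">Company</h1>'
soup = BeautifulSoup(html, "html.parser")

# Option 1: match on the data-testid attribute alone, which is usually
# more stable than generated class names.
tag = soup.find("h1", attrs={"data-testid": "organization-cover-title"})

# Guard against None before reading the text, so a missing element
# does not raise AttributeError.
company = tag.get_text(strip=True) if tag is not None else None

# Option 2: a CSS selector that requires the attribute and both classes,
# regardless of the order in which the classes appear.
tag_css = soup.select_one(
    'h1[data-testid="organization-cover-title"].sc-1wcv2gl-2.kpUYKd'
)

print(company)
```

If both lookups still come back empty on the real page, printing the fetched HTML and searching it for the data-testid value is a quick way to check whether the element is there at all.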


How can I web scrape specific details from an HTML tag?

I'm trying to scrape specific details as a list from a page using BeautifulSoup in Python. <p class="collapse text in" id="list_2"> <big>•</big> &nbsp;car <br> <big>•</big> &nbsp;bike&nbsp; <br> <span id="list_hidden_2" class="inline_hidden collapse in" aria-expanded="true"> <big>•</big> &nbsp;bus <br> <big>•</big> &nbsp;train <br><br> </span>…
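One possible approach, sketched here under assumptions: each item is the text node that directly follows a <big> bullet, so the list can be built from those siblings. The excerpt is truncated, so the closing </p> in the sample below is an assumption, and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Markup from the question; the closing </p> is assumed because the
# excerpt is cut off after </span>.
html = """<p class="collapse text in" id="list_2"> <big>•</big> &nbsp;car <br>
<big>•</big> &nbsp;bike&nbsp; <br>
<span id="list_hidden_2" class="inline_hidden collapse in" aria-expanded="true">
<big>•</big> &nbsp;bus <br> <big>•</big> &nbsp;train <br><br> </span></p>"""

soup = BeautifulSoup(html, "html.parser")
paragraph = soup.find("p", id="list_2")

items = []
# find_all searches recursively, so bullets inside the nested <span>
# are included as well.
for bullet in paragraph.find_all("big"):
    sibling = bullet.next_sibling
    # Text nodes are str subclasses; skip anything that is a tag or None.
    if isinstance(sibling, str):
        # Non-breaking spaces (&nbsp;) count as whitespace for strip().
        text = sibling.strip()
        if text:
            items.append(text)

print(items)
```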


Remove all text from an HTML node using regex

Is it possible to remove all text from HTML nodes with a regex? This very simple case seems to work just fine: import htmlmin html = """ <li class="menu-item"> <p class="menu-item__heading">Totopos</p> <p>Chips and molcajete salsa</p> <p class="menu-item__details menu-item__details--price"> <strong> <span…
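As an alternative to a regex (this swaps in an HTML parser and is not the approach the question asks about), the text nodes can be removed with BeautifulSoup while leaving tags and attributes in place. The sample below is a shortened, assumed version of the markup in the excerpt, which is truncated.

```python
from bs4 import BeautifulSoup

# Shortened sample based on the question's markup; the original excerpt
# is cut off, so this structure is an assumption.
html = (
    '<li class="menu-item">'
    '<p class="menu-item__heading">Totopos</p>'
    '<p>Chips and molcajete salsa</p>'
    '</li>'
)

soup = BeautifulSoup(html, "html.parser")

# find_all(string=True) returns every text node in the tree;
# extract() detaches each one, leaving the tags intact.
for text_node in soup.find_all(string=True):
    text_node.extract()

stripped = str(soup)
print(stripped)
```

Working on the parsed tree avoids the edge cases a regex tends to hit, such as ">" characters inside attribute values.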


BeautifulSoup Cannot Find Tag img – Html

I'm trying to scrape the image link from the URL https://www.eaton.com/us/en-us/skuPage.101012%2520G.html All my solutions failed; here are my attempts: print(soup.select_one('[class="module-media-gallery__image lazyload"]')["src"]) img=soup.find('img',attrs={'class':'module-media-gallery__image lazyload'}) img=soup.find('img',class_='module-media-gallery__image lazyload')


Cannot locate text within using Python – Html

Hi all, I am scraping questions on Amazon using the following code: url = "https://www.amazon.com/ask/questions/asin/B0000CFLYJ/1/ref=ask_ql_psf_ql_hza?isAnswered=true" r = requests.get("http://localhost:8050/render.html", params = {'url': url, 'wait': 3}) soup = BeautifulSoup(r.text, 'html.parser') questions = soup.find_all('div', {'class':'a-fixed-left-grid-col a-col-right'}) print(questions) question_list = [] for item in…
