Is there a way to extract a link from HTML that contains both an a tag and a div tag?
html1:
<a href="https://u50.ct.sendgrid.net/ls" target="_blank">
<div class="subtitle">
Service request #2226754
</div></a>
html2:
<div class="subtitle">
Service request <a href="https://u5024.ct.sendgrid.net/ls" style="color:#5A88AA; text-decoration:underline;" target="_blank">#2604467</a>
</div>
code:
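import re
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
# Find the div whose text matches "Service request" (case-insensitive)
scores_string = soup.find("div", text=re.compile('Service request', re.IGNORECASE))
print(scores_string)
# This only works when the a tag wraps the div (html1)
ahref = scores_string.find_parent("a")
print(ahref["href"])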
Required output:
1) https://u50.ct.sendgrid.net/ls
2) https://u5024.ct.sendgrid.net/ls
I have two HTML documents in different formats, and I need to extract the URL from each. Is there a solution using BeautifulSoup?
2 Answers
div = soup.find('div', class_='subtitle')
a_tag = div.find('a')
link = a_tag['href']
If the subtitle div is inside the a tag, look up to the wrapping a tag instead, for example with div.find_parent('a'). Both find() and find_parent() return None when nothing matches, so you may also want some error handling around the code above.
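As a rough sketch of how both layouts from the question could be handled in one place: the helper name subtitle_link and the HTML1/HTML2 constants below are made up for illustration, and the code assumes the div always carries class "subtitle" as in the question.

```python
from bs4 import BeautifulSoup

# Sample inputs copied from the question (constant names are illustrative).
HTML1 = """<a href="https://u50.ct.sendgrid.net/ls" target="_blank">
<div class="subtitle">
Service request #2226754
</div></a>"""

HTML2 = """<div class="subtitle">
Service request <a href="https://u5024.ct.sendgrid.net/ls" style="color:#5A88AA; text-decoration:underline;" target="_blank">#2604467</a>
</div>"""


def subtitle_link(html):
    """Return the href tied to the subtitle div, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="subtitle")
    if div is None:
        return None
    # Try a link nested inside the div first, then a link wrapping the div.
    a_tag = div.find("a") or div.find_parent("a")
    if a_tag is None:
        return None
    # .get() returns None instead of raising if the href attribute is missing.
    return a_tag.get("href")


print(subtitle_link(HTML1))
print(subtitle_link(HTML2))
```

Returning None rather than raising is just one possible choice; raising a ValueError may suit you better if a missing link should be treated as a bug.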
You can implement a custom tag filter. My solution doesn't need an extra import for regexes, but for more complex cases one may be required or advisable.
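The code for this answer is not shown here, so the following is only a guess at what such a filter might look like, not the answerer's actual code. BeautifulSoup's find() accepts a function that takes a tag and returns True or False; the names is_service_request_link and find_service_link are hypothetical.

```python
from bs4 import BeautifulSoup

# Sample inputs copied from the question (constant names are illustrative).
HTML1 = """<a href="https://u50.ct.sendgrid.net/ls" target="_blank">
<div class="subtitle">
Service request #2226754
</div></a>"""

HTML2 = """<div class="subtitle">
Service request <a href="https://u5024.ct.sendgrid.net/ls" style="color:#5A88AA; text-decoration:underline;" target="_blank">#2604467</a>
</div>"""

NEEDLE = "service request"


def is_service_request_link(tag):
    """Custom filter: True for an a tag associated with 'Service request'."""
    if tag.name != "a" or not tag.has_attr("href"):
        return False
    # Layout 1: the a tag wraps the div, so its own text has the phrase.
    if NEEDLE in tag.get_text().lower():
        return True
    # Layout 2: the a tag sits inside the subtitle div that has the phrase.
    parent = tag.find_parent("div", class_="subtitle")
    return parent is not None and NEEDLE in parent.get_text().lower()


def find_service_link(html):
    soup = BeautifulSoup(html, "html.parser")
    a_tag = soup.find(is_service_request_link)
    return a_tag["href"] if a_tag is not None else None


print(find_service_link(HTML1))
print(find_service_link(HTML2))
```

A plain lowercase substring test stands in for the case-insensitive regex from the question, which is why no import of re is needed.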