Is there a way for me to use BeautifulSoup to get the text of tags that contain more than one word?
For example, if I had this HTML:
<div>
<div>
<a>hello there</a>
<a>hi</a>
</div>
<a>what's up</a>
<a>stackoverflow</a>
</div>
…I just want to get:
hello there what's up
2 Answers
Yes, you can use BeautifulSoup to extract the text of only those tags that contain more than one word. The idea is to find the tags, read the text of each one, and keep it only if it splits into more than one word.
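The code that originally accompanied this answer is not shown here, so the following is only a minimal sketch of the idea, assuming the built-in html.parser backend and the variable names html, soup and multi_word (all chosen for illustration). It uses the HTML from the question and counts the words in each a tag with str.split().

```python
from bs4 import BeautifulSoup

# HTML taken from the question
html = """
<div>
<div>
<a>hello there</a>
<a>hi</a>
</div>
<a>what's up</a>
<a>stackoverflow</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Keep the text of every <a> tag whose stripped text has more than one word
multi_word = [
    a.get_text(strip=True)
    for a in soup.find_all("a")
    if len(a.get_text(strip=True).split()) > 1
]

print(" ".join(multi_word))  # hello there what's up
```

Splitting on whitespace with split() treats spaces, tabs and newlines alike, so the check does not depend on which whitespace character separates the words.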
If you would like to use BeautifulSoup, you could also use stripped_strings and iterate over its result, checking whether each string contains whitespace.
Alternatively, you can check each tag individually with .get_text(), but I would recommend stripping the results with .get_text(strip=True) before checking for whitespace.
Example
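The example code from this answer did not survive, so here is a hedged reconstruction of both variants it describes, not the author's exact code. The names html, soup, has_whitespace, from_strings and from_tags are assumptions made for this sketch, and html.parser is assumed as the parser.

```python
from bs4 import BeautifulSoup

# HTML taken from the question
html = """
<div>
<div>
<a>hello there</a>
<a>hi</a>
</div>
<a>what's up</a>
<a>stackoverflow</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def has_whitespace(text):
    """Return True if the text contains any whitespace character."""
    return any(ch.isspace() for ch in text)


# Variant 1: stripped_strings yields every text node with surrounding
# whitespace removed and skips strings that are whitespace only.
from_strings = [s for s in soup.stripped_strings if has_whitespace(s)]

# Variant 2: check each tag individually, stripping before the check so
# that leading or trailing whitespace does not count as a word separator.
from_tags = [
    a.get_text(strip=True)
    for a in soup.find_all("a")
    if has_whitespace(a.get_text(strip=True))
]

print(" ".join(from_strings))  # hello there what's up
print(" ".join(from_tags))     # hello there what's up
```

The first variant looks at all text in the document regardless of tag, while the second is limited to the tags you select, which may matter if the page contains multi-word text outside the tags you care about.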