I’m trying to count the elements in a list using Beautiful Soup, but I’m getting an empty list back. I’m pretty sure this used to work for me, but it doesn’t anymore.
I’d appreciate any help or pointers from the gurus out there as I’m sure there is a better way. I’m completely new to this and feel a bit lost.
So if we take a nested list like below with 3 elements:
<div class="row">
...
<div class="style_details">
<ul data-id="list" class="listing_details">
<li data-id="listing-index-1"></li>
<li data-id="listing-index-2"></li>
<li data-id="listing-index-3"></li>
</ul>
</div>
and a snippet of code to count the list elements using the attribute ‘class="listing_details"’
browser.get(url)
c = browser.page_source
soup = BeautifulSoup(c, "html.parser")
dom = etree.HTML(str(soup))
data = soup.findAll('li',attrs={'class':'listing_details'})
links = len(data)
return links
Is the class being nested in an unordered list causing the issue? Any ideas how to overcome this or a better way to count items on the list?
2 Answers
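It seems you are using both BS and lxml. The class listing_details is on the <ul>, not on the <li> elements, so findAll('li', attrs={'class': 'listing_details'}) matches nothing. In either case, you should be counting the children of <ul data-id="list" class="listing_details">. So with BS, it should be (using CSS selectors):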
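A minimal sketch of the CSS-selector approach, assuming the page source is already available as a string. The `html` variable below is a stand-in for `browser.page_source` and contains only the part of the question's markup that was shown:

```python
from bs4 import BeautifulSoup

# Stand-in for browser.page_source: the markup shown in the question.
html = """
<div class="style_details">
  <ul data-id="list" class="listing_details">
    <li data-id="listing-index-1"></li>
    <li data-id="listing-index-2"></li>
    <li data-id="listing-index-3"></li>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Select every <li> that sits inside the <ul> carrying the class.
items = soup.select("ul.listing_details li")
count = len(items)
print(count)
```

The selector targets the `<ul>` by its class and then descends to the `<li>` elements, which is the step the original `findAll` call was missing.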
and with lxml:
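A comparable sketch with lxml and XPath, under the same assumption that `html` stands in for `browser.page_source`. The XPath expression here is one reasonable way to write it, not necessarily the only one:

```python
from lxml import etree

# Stand-in for browser.page_source: the markup shown in the question.
html = """
<div class="style_details">
  <ul data-id="list" class="listing_details">
    <li data-id="listing-index-1"></li>
    <li data-id="listing-index-2"></li>
    <li data-id="listing-index-3"></li>
  </ul>
</div>
"""

dom = etree.HTML(html)

# Find the <ul> by its class attribute, then take its <li> children.
items = dom.xpath('//ul[@class="listing_details"]/li')
count = len(items)
print(count)
```

Note that `@class="listing_details"` is an exact match on the attribute value, so it would need adjusting if the `<ul>` ever carried more than one class.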
The output should be 3 in both cases.

If you want to select only direct children, you can use the following example:
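A minimal sketch using the CSS child combinator (`>`). To make the difference visible, the markup below adds a nested `<ul>` inside the first item. That nested list is hypothetical and is not part of the question's page:

```python
from bs4 import BeautifulSoup

# Question's markup plus a HYPOTHETICAL nested list inside the first item.
html = """
<ul data-id="list" class="listing_details">
  <li data-id="listing-index-1"><ul><li>nested item</li></ul></li>
  <li data-id="listing-index-2"></li>
  <li data-id="listing-index-3"></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# "ul.listing_details > li" matches only <li> elements that are direct children.
direct = soup.select("ul.listing_details > li")

# "ul.listing_details li" would also match the nested <li>.
all_descendants = soup.select("ul.listing_details li")

print(len(direct))
```

With the nested list present, the descendant selector would count one extra item, while the child combinator still counts only the top-level entries.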
For the three-item list in the question, this prints 3.
OR: Using the bs4 API:
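A sketch of the same count with `find` and `find_all` instead of a selector, again assuming `html` stands in for `browser.page_source`:

```python
from bs4 import BeautifulSoup

# Stand-in for browser.page_source: the markup shown in the question.
html = """
<ul data-id="list" class="listing_details">
  <li data-id="listing-index-1"></li>
  <li data-id="listing-index-2"></li>
  <li data-id="listing-index-3"></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Locate the <ul> by class first; find() returns None if it is missing.
ul = soup.find("ul", class_="listing_details")

# recursive=False limits the search to direct children of the <ul>.
items = ul.find_all("li", recursive=False) if ul is not None else []
count = len(items)
print(count)
```

Checking for `None` keeps the count at 0 instead of raising an error when the list has not been rendered yet, which can happen with pages loaded through a browser driver.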