I want to scrape data from a webpage. Here’s the code I have. It is supposed to get all the authors, but it only gets the first one (‘Simon Butler’).
Array.from(document.querySelectorAll('#author-group'))
  .map(e =>
    e.querySelector('[class="button-link workspace-trigger button-link-primary"]'))
  .map(e =>
    e.querySelector('[class="button-link-text"]'))
  .map(e =>
    e.querySelector('[class="react-xocs-alternative-link"]'))
  .map(e =>
    e.querySelector('[class="given-name"]').textContent + ' '
    + e.querySelector('[class="text surname"]').textContent)
  .join(', ')
As I see it, the error comes from using querySelector, as it gets only the first element. However, when I use querySelectorAll, I get the following error: e.querySelectorAll is not a function.
I want to scrape data from https://www.sciencedirect.com/science/article/pii/S0164121219302262.
I didn’t include any HTML because the source is really huge, even for just the portion with the author information. I’m not familiar enough with HTML or JS to produce a minimal HTML sample.
2 Answers
The first line, Array.from(document.querySelectorAll('#author-group')), creates an array with one element in it.
In the first map, the code you provided used querySelector, which only returned one item (which is what you said you weren’t looking for), but you said you also tried querySelectorAll.
Since the previous step returned an array with an element in it, e is an Element. Elements have querySelectorAll, so this call is fine.
However, the callback now returns a NodeList, not an Element.
In the next map, e is therefore a NodeList. It isn’t an Element. NodeLists don’t have querySelector or querySelectorAll methods, which is why you get the error.
You need to loop over the NodeList (perhaps by converting it with Array.from and then using map) and deal with each element one by one.
Probably what you should be doing is calling querySelectorAll once and using descendant combinators to describe the elements containing each author in a single query. Then you would be able to:
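A minimal sketch of both fixes described in this answer, assuming the class names from the question's code still match the live page. The function names (authorNamesByLooping, authorNamesSingleQuery, fullName) are made up for illustration, and the root parameter is there only so the code can be tried against any document or element.

```javascript
// Sketch only: selectors come from the question's code and are assumed to
// match the page; all function names here are hypothetical.

// Option 1: keep the step-by-step shape, but flatten each NodeList so the
// next step receives Elements again.
function authorNamesByLooping(root) {
  return Array.from(root.querySelectorAll('#author-group'))
    // querySelectorAll returns a NodeList; Array.from turns it into an array
    // and flatMap merges those arrays into one flat array of Elements.
    .flatMap(group =>
      Array.from(group.querySelectorAll('.react-xocs-alternative-link')))
    .map(fullName)
    .join(', ');
}

// Option 2: a single query. The space in the selector is the descendant
// combinator, so the nesting is described in the selector itself.
function authorNamesSingleQuery(root) {
  return Array.from(
    root.querySelectorAll('#author-group .react-xocs-alternative-link'))
    .map(fullName)
    .join(', ');
}

// Builds "Given Surname" from one author link element.
function fullName(link) {
  const given = link.querySelector('.given-name');
  const surname = link.querySelector('.surname');
  return [given, surname]
    .filter(Boolean)                      // skip a part that is missing
    .map(el => el.textContent.trim())
    .join(' ');
}
```

In the browser console on the article page, either function would be called as authorNamesSingleQuery(document). The class selectors (.given-name, .surname) are used instead of the exact-match attribute selectors from the question so that an extra class on an element does not break the match.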
There’s only one #author-group. The elements containing authors are #author-group button. The name elements are .given-name and .surname.
Have a look at a CSS Selector reference to learn how they work. Ctrl-F in the Elements view of Chrome’s developer tools (F12) lets you test selectors.
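Selectors can also be checked from the Console tab. The helper below is a hypothetical sketch (countMatches is not a built-in) that reports how many elements each selector matches, so you can confirm the counts are what you expect before building the full expression.

```javascript
// Hypothetical helper: returns an object mapping each selector to the
// number of elements it matches under the given root.
function countMatches(root, selectors) {
  const counts = {};
  for (const selector of selectors) {
    counts[selector] = root.querySelectorAll(selector).length;
  }
  return counts;
}

// In the browser console on the article page (selectors from this answer):
// countMatches(document, [
//   '#author-group',
//   '#author-group button',
//   '#author-group .given-name',
//   '#author-group .surname',
// ]);
```

If the selectors behave as described above, #author-group should report one match and the button, given-name and surname counts should each equal the number of authors.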
You can also use textContent of an element containing all the elements you want. #author-group button contains some extra letters, but #author-group .react-xocs-alternative-link contains just the names.
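A short sketch of that textContent approach, using the selector named above. The function name authorNamesFromText is hypothetical, and whether the given name and surname come out separated by whitespace depends on the page's markup, which is not verified here.

```javascript
// Sketch only: reads the whole text of each author link instead of
// querying the given-name and surname elements separately.
function authorNamesFromText(root) {
  return Array.from(
    root.querySelectorAll('#author-group .react-xocs-alternative-link'),
    // Collapse runs of whitespace (including line breaks) into single spaces.
    link => link.textContent.replace(/\s+/g, ' ').trim()
  ).join(', ');
}

// In the browser console on the article page:
// authorNamesFromText(document);
```

Array.from accepts a mapping function as its second argument, so the NodeList is converted and mapped in one step.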