Using Cheerio, how can I grab 2 separate pieces of HTML content which follow an HTML element and are not inside a specific HTML element?
What I want to grab is from:
<div>
<time>
<svg>...<svg/>
"first string I want to grab"
<svg>...<svg/>
"second string I want to grab"
</time>
</div>
$(item).find('div').find('time').find('svg:nth-of-type(2)').text();
const result = [...$(item).find('header').find('div').find('span:nth-of-type(1)').find('time').childNodes]
.filter(e =>
e.nodeType === Node.TEXT_NODE && e.textContent.trim()
)
.map(e => e.textContent.trim());
2 Answers
Your example isn't reproducible, but if you fix your selectors and/or use correct closing tags, `</svg>` rather than `<svg/>`, the approach in this answer should work out of the box.
As mentioned in the comments, CSS already handles descendants, so you can use a single descendant selector rather than chaining `.find()` calls.
If this doesn't work, please share the actual site or full HTML structure you're working with. In addition to the `<svg/>` typo, there is no `<span>` in your snippet.

It's surprising there are no class names here. Usually, classes, attributes and ids are more reliable than `nth` tag selectors. Instead of retyping an incorrect excerpt, it's better to provide the actual HTML, copy-pasted to preserve syntax and attributes.

Note that Cheerio only works on static HTML. If the site uses JavaScript to create these elements, that might explain why you can't find them if you're pulling down the page with fetch or axios. Ensure the elements are visible in the `view-source:` version of the site; the dev tools element inspector might be misleading. If they're not in the static HTML, consider using Playwright rather than fetch/cheerio to scrape them.

Additional "get text node in Cheerio" threads:
… `<br>` tag in Cheerio

You have to use the parse5 methods for "text nodes":