I’m struggling to load an HTML fragment with Cheerio.
var cheerio = require('cheerio');
var htmlString = '<div class="artist"><i class="user blue circle icon"></i> Skyy</div>';
var $ = cheerio.load(htmlString);
console.info($.html());
Outputs
<html><head></head><body><div class="artist"><i class="user blue circle icon"></i> Skyy</div></body></html>
My problem is, I think, that Cheerio wraps my content within an HTML document, which makes it difficult to access the node directly.
I could use this selector, which works fine in this case:
var el = $('body').children().first();
But it doesn’t always work. For instance,
var htmlString = '<meta name="description" content="My description">';
var $ = cheerio.load(htmlString);
console.info($.html());
This outputs a different kind of document, where var el = $('body').children().first(); does not work, because the meta element is placed in the head and the body is empty:
<html><head><meta name="description" content="My description"></head><body></body></html>
So, is there a way to load an HTML fragment and to access it as a Cheerio element without using a selector?
I want to be able to use Cheerio functions like .text(), .html() or .attr() on the populated node.
3 Answers
I found a solution.
By loading a blank document, I can append my HTML string to it manually, so I'm sure it ends up in the <body>, even if it's a meta element that Cheerio would normally place in the <head>.

There is an option in Cheerio to disable wrapping your HTML in other tags: the third argument of cheerio.load. Pass false for it; the second argument takes an object containing additional options, and we can set it to null. You can view the source for more info.

This answer offers a great start, but it doesn’t show a consistent way to access the tag.
$(":root").first(); seems like a good approach for extracting the first tag. I have not tested it extensively, but it looked promising on a spot check.