
I have been trying to parse an HTML string into DOM nodes, and I keep running into the same issue with the anchor links regardless of the approach I use.

Here is the HTML string for testing:

let htmlString = '<div><h2>Absolute URLs</h2><p><a href="https://www.w3.org/">W3C</a></p><p><a href="https://www.google.com/">Google</a></p><h2>Relative URLs</h2><p><a href="html_images.asp">HTML Images</a></p><p><a href="/css/default.asp">CSS Tutorial</a></p></div>';

I have tried parsing it with both of the following methods and get the same output each time. I am not sure how to resolve this.

let template = document.createElement('template');
htmlString = htmlString.trim();
template.innerHTML = htmlString;
  
console.log(template.content.firstChild);
let docs = new DOMParser().parseFromString(htmlString,'text/html');
console.log(docs.body.firstChild);

The resulting DOM is shown below. The slashes in the anchor tags disappear and the href is repeated in the closing anchor tag. This happens with every anchor tag:

<div><h2>Absolute&nbsp;URLs</h2><p><a href="https: www.w3.org="" "="">W3C</a href="https:></p><p><a href="https: www.google.com="" "="">Google</a href="https:></p><h2>Relative&nbsp;URLs</h2><p><a href="html_images.asp">HTML&nbsp;Images</a href="html_images.asp"></p><p><a href=" css="" default.asp"="">CSS&nbsp;Tutorial</a href="></p></div>

Any help on how to go about this would be highly appreciated.

2 Answers


  1. If you don’t need to create a template element like this:

    let template = document.createElement('template');
    

    you can put a container element in your HTML to hold the content, for example:

    <div id="myContainer"></div>
    

    And then:

    document.getElementById("myContainer").innerHTML = htmlString;
    

    This is how I usually do it, and it works.

    You can also add the anchor empty and then fill in its data via JS.

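  2. From looking at your Codepen, it looks like the issue is that all of the spaces in htmlString are non-breaking spaces (that’s why you see the &nbsp; in the output). The HTML parser does not treat a non-breaking space as whitespace, so in `<a&nbsp;href=...>` the href is read as part of the tag name instead of as an attribute.

    Stack Overflow must have replaced them with normal spaces when the question was posted, which is why we couldn’t reproduce the issue at first.

    To solve it, use the String#normalize method. The NFKD form maps a non-breaking space to a normal space, but note that it also decomposes other characters in the string (accented letters, ligatures and so on):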

    let docs = new DOMParser().parseFromString(htmlString.normalize('NFKD'),'text/html');
    

    You could also manually replace all of the non-breaking spaces (\xa0) with normal spaces:

    let docs = new DOMParser().parseFromString(htmlString.replace(/\xa0/g, ' '),'text/html');
    
    // You can also use replaceAll (an alternative to the line above, so docs is reassigned rather than redeclared)
    docs = new DOMParser().parseFromString(htmlString.replaceAll('\xa0', ' '),'text/html');
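
If you want to check whether a string is affected before parsing it, something like the following might help. This is only a rough sketch: `findNonBreakingSpaces` and `replaceNonBreakingSpaces` are made-up helper names, and the `broken` string is a shortened stand-in for the real htmlString, written with an explicit `\u00a0` so the non-breaking space survives copy and paste.

```javascript
// Hypothetical helper: return the index of every non-breaking space (U+00A0) in the text.
function findNonBreakingSpaces(text) {
  const positions = [];
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) === 0xa0) {
      positions.push(i);
    }
  }
  return positions;
}

// Hypothetical helper: swap every U+00A0 for an ordinary space (U+0020).
function replaceNonBreakingSpaces(text) {
  return text.replace(/\u00a0/g, ' ');
}

// Assumed sample input: a non-breaking space sits between "a" and "href".
const broken = '<a\u00a0href="https://www.w3.org/">W3C</a>';
const fixed = replaceNonBreakingSpaces(broken);

console.log(findNonBreakingSpaces(broken)); // [ 2 ]
console.log(fixed); // <a href="https://www.w3.org/">W3C</a>
```

An empty array from `findNonBreakingSpaces` means the string has no non-breaking spaces and the problem lies elsewhere. The cleaned string can then be passed to DOMParser or assigned to innerHTML as in the snippets above.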
    