I have an HTML string containing a <script> tag. The JavaScript in it creates a shadow DOM element via window.customElements.define(…), and the element's class sets innerHTML to a string holding the custom element's HTML.
This is valid HTML, which I'm attempting to process using PHP's DOMDocument. However, DOMDocument appears to be confused by the content of the innerHTML string and starts treating its content as nodes it needs to process.
Is there any way to work around this so it no longer confuses DOMDocument?
The pertinent part of the HTML looks somewhat like this:
<script>
  class ExampleElement extends HTMLElement {
    constructor() {
      super();
      this.attachShadow({ mode: 'open' })
        .innerHTML = '<label>this is what confuses DOMDocument</label>'
    }
  }
  window.customElements.define('example-element', ExampleElement);
</script>
This is then processed in PHP like this:
$doc = new DOMDocument();
$doc->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
libxml then generates an error about the </label> not matching: "Unexpected end tag : label in Entity"
Obviously I can either
– break up the innerHTML string using concatenation, so that DOMDocument no longer identifies <label> and </label> as tags
or
– build the element's content via document.createElement(…) etc.
However, since this is valid HTML, it would be useful to know whether it can be parsed as it stands.
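For reference, the first workaround (string concatenation) would look roughly like the sketch below. The variable names `whole` and `split` are illustrative only and are not part of the real page; the point is that the sequence `</` never appears in the split version's source text, while the resulting string is unchanged.

```javascript
// The original literal: contains "</label>" in the script source.
const whole = '<label>this is what confuses DOMDocument</label>';

// Split version: "<" and "/label>" are separate literals, so the
// characters "</" never appear next to each other in the source.
const split = '<label>this is what confuses DOMDocument<' + '/label>';

// Both expressions evaluate to the same string at runtime.
console.log(whole === split);
```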
2 Answers
You can get PHP's DOMDocument to parse HTML containing this JavaScript with a small change to the string.
Per: https://bugs.php.net/bug.php?id=80095
So change </label> to <\/label>. It will then parse cleanly, and JS should interpret \/ as a literal / in the string.
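A quick way to convince yourself that the escaped form is harmless on the JavaScript side is sketched below. The variable names `plain` and `escaped` are made up for illustration. In a JavaScript string literal, a backslash followed by `/` yields just `/`, so both literals should produce the same string.

```javascript
// Unescaped closing tag, as in the original script.
const plain = '<label>this is what confuses DOMDocument</label>';

// Escaped closing tag: "\/" in a string literal is just "/".
const escaped = '<label>this is what confuses DOMDocument<\/label>';

// The two strings are identical at runtime, and no backslash survives.
console.log(plain === escaped);
console.log(escaped.includes('\\'));
```

Only the script source changes; the value assigned to innerHTML in the browser stays the same.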