I am trying to programmatically strip some unwanted tags from a table. I wrote a recursive function that is called on the table element and calls itself on all children, building a list of the sanitized children and replacing the current children with it. However, the nodes don’t get added correctly: instead, a string representation such as [object HTMLTableSectionElement] or [object Text] ends up in the output.
(For those interested in context: I use thead, tbody, and a couple of custom tags for a fancy table, but I also want to be able to press a button to copy the table into Wikipedia pages, which don’t support those tags, so while copying to the clipboard I have to remove them or it looks terrible.)
This is my function:
function prepHTMLforExport(element){
    //recursively goes into an HTML element, preparing it for export by:
    //removing thead and tbody statements but leaving their contents in the parent element
    //removing floattext tags
    //replacing hyperlink a tags with ...

    //go through child elements and prep each for export
    var newchildren = []
    var children = element.childNodes
    for (var i=0;i<children.length;i++){
        var newkid = prepHTMLforExport(children[i])
        //if the new kid is actually a list, concat the lists
        if (newkid instanceof Array){
            newchildren.concat(newkid)
        } else {
            newchildren.push(newkid)
        }
    }
    //if to be removed, return a list of children
    if (element.tagName in ["floattext", "table-body", "table-head"]){
        return newchildren
    } else {
        //else add children to self and return self
        if (newchildren.length > 0) {
            element.replaceChildren(newchildren)
        }
        return element
    }
}
It should work on basically any HTML table.
I originally suspected that I was checking the wrong children or using the wrong method to replace them, so I also tried var children = element.children and looked for a different replacement function, such as replaceWith, but neither helped. On closer inspection, elements are already converted to the wrong form in their parent’s node list while they are being processed, not only after I replace them with their sanitized version. To test this, take this table:
<table>
<th>sample text</th>
<th>some more sample text</th>
</table>
When stepping through the code, once it has finished preparing the first cell and is about to recurse into the second cell, executing console.log(element) results in
<table>
<th>[object Text]</th>
<th>some more sample text</th>
</table>
2 Answers
Your issue is that element.replaceChildren expects each child to be a separate parameter, but you are passing an array. So replaceChildren converts the array to a string, giving the [object ...] text in your output. The extra , comma in the middle is an additional hint that this was an array converted to a string.
You can convert an array to parameters using spread syntax (...), so that line of your code becomes element.replaceChildren(...newchildren).
Updated snippet:
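The original snippet is not reproduced here, so the following is only a minimal sketch of the effect described above, written so it also runs outside a browser. fakeReplaceChildren and the two fake nodes are hypothetical stand-ins, not DOM APIs: they only mimic how non-node arguments end up as text.

```javascript
// Hypothetical stand-in for replaceChildren, NOT the real DOM method:
// it keeps "nodes" as they are and turns any other argument into a string.
function fakeReplaceChildren(...args) {
  return args.map((arg) => (arg && arg.isNode ? arg : String(arg)));
}

// Two fake nodes whose string form mimics what DOM nodes print.
const section = { isNode: true, toString: () => "[object HTMLTableSectionElement]" };
const text = { isNode: true, toString: () => "[object Text]" };
const newchildren = [section, text];

// Passing the array itself: it is a single non-node argument, so it is
// stringified, and the entries are joined with a comma.
const wrong = fakeReplaceChildren(newchildren);

// Spreading the array: each node arrives as its own argument.
const right = fakeReplaceChildren(...newchildren);

console.log(wrong); // one string containing both "[object ...]" entries
console.log(right); // the two node objects, untouched
```

In the browser, the equivalent fix would be the single call element.replaceChildren(...newchildren) inside prepHTMLforExport.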
In addition to the array issue mentioned by @fdomn-m in his answer, there are three other issues I see.
Using element.childNodes can return text nodes like " " or "\n", not just element nodes. Using element.children will fix that.
element.tagName returns tag names in uppercase, and you are comparing against lowercase strings.
Finally, you are comparing against "table-head" and "table-body" tags. I presume that you meant "thead" and "tbody". I don’t know what the floattext is that you are checking for.
Here is your code with all three of those changes.
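The revised code itself is not reproduced here, so what follows is a sketch of how those changes might look rather than the answerer's exact version. Beyond the three changes listed, it assumes a few more: the array is spread into replaceChildren (per the other answer), the tag check uses includes instead of the in operator (which tests array indices, not values), and the result of concat is assigned back because concat does not modify the array in place. The name TAGS_TO_UNWRAP is made up for illustration.

```javascript
// Tags whose wrapper is dropped while their contents are kept.
// "floattext" comes from the question; "thead" and "tbody" follow the
// answer's guess about what "table-head" and "table-body" meant.
const TAGS_TO_UNWRAP = ["floattext", "thead", "tbody"];

function prepHTMLforExport(element) {
  let newchildren = [];
  // element.children skips text nodes; copy it so the loop sees a stable list
  const children = Array.from(element.children);
  for (const child of children) {
    const newkid = prepHTMLforExport(child);
    if (Array.isArray(newkid)) {
      // concat returns a new array, so the result has to be kept
      newchildren = newchildren.concat(newkid);
    } else {
      newchildren.push(newkid);
    }
  }
  // tagName is uppercase for HTML elements, so normalise before comparing
  if (TAGS_TO_UNWRAP.includes(element.tagName.toLowerCase())) {
    return newchildren;
  }
  if (newchildren.length > 0) {
    // spread the array so each child is passed as a separate argument
    element.replaceChildren(...newchildren);
  }
  return element;
}
```

One caveat with this approach: because element.children ignores text nodes, any text that sits directly beside child elements, or directly inside an unwrapped tag, is dropped. If that text matters, iterating element.childNodes and passing text nodes through unchanged may be the safer variant.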