I use the DOMPurify library (https://github.com/cure53/DOMPurify) to clean up HTML copied from Google Docs.
I would like to remove the span tags from the copied text but keep the text inside them, as well as any strong tags nested inside the removed span tags.
I managed to remove the span tags while keeping their text, but the nested strong tags are removed as well.
Example
// Template literal, so the markup can span several lines and contain double quotes.
DOMPurify.sanitize(`<p>
<span style="background-color:transparent;color:#000000;"><strong>Some strong text</strong></span>
</p>`, {
  ALLOWED_TAGS: ['p', 'strong']
})
Output
<p>Some strong text</p>
Expected output
<p><strong>Some strong text</strong></p>
I also tried with this kind of hook
DOMPurify.addHook("afterSanitizeAttributes", function (node, data, config) {
  if (node.nodeName === "SPAN") {
    node.replaceWith(node.textContent ?? "");
  }
});
But the output is the same: the <strong> tags inside the <span> are also deleted.
Can you please help me keep the nested <strong> tags after sanitizing?
Many thanks
2 Answers
IT goldman, thanks for your reply, and especially for pointing out that my code works. Your comment prompted me to test further. I actually use DOMPurify together with CKEditor 5 (https://ckeditor.com/ckeditor-5/), so I tested my DOMPurify code outside the CKEditor context and found that it works as expected.
I then continued testing inside CKEditor to understand what was happening.
It turned out that I had left the "font" plugin enabled in my CKEditor configuration, and because of that CKEditor kept adding the unwanted <span> tag after the DOMPurify sanitization, which is what confused me. This post helped me identify and fix the problem: https://github.com/ckeditor/ckeditor5/issues/6492
After removing the font plugin from the CKEditor configuration, everything works as expected.
In fact, I no longer even need DOMPurify, because CKEditor's Paste from Office / Paste from Google Docs feature cleans up the Google Docs markup well enough for my needs.
Thanks again for your help and for taking the time to answer me because it allowed me to take a step back and fix my problem.
Thanks also for the idea of temporarily converting <strong> to [strong]. I will keep this trick in my back pocket because it will certainly be useful to me in other situations.
I will modify the title of my post and add the ckeditor tag, because the problem turned out to concern CKEditor more than DOMPurify.
Thanks again
First of all, your code does work.
But a workaround would have been to replace all <strong> with [strong], purify, then replace back. You can add logic to check beforehand whether the string already contains [strong] to make it robust. I will assume it doesn't.
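The placeholder workaround described above could be sketched roughly as follows. This is only an illustration under some assumptions: the helper name sanitizeKeepingStrong, the [/strong] closing token, and the idea of passing the sanitizer in as a function are all hypothetical and not part of DOMPurify. Any attributes on the original <strong> tags are dropped.

```javascript
// Hypothetical helper illustrating the "[strong]" placeholder workaround.
// The token names and the function name are assumptions made for this sketch.
const OPEN_TOKEN = "[strong]";
const CLOSE_TOKEN = "[/strong]";

function sanitizeKeepingStrong(html, sanitize) {
  // Robustness check: if the input already contains a placeholder,
  // restoring would turn that literal text into a real tag, so bail out.
  if (html.includes(OPEN_TOKEN) || html.includes(CLOSE_TOKEN)) {
    throw new Error("Input already contains a placeholder token");
  }

  // 1. Hide <strong> tags from the sanitizer (their attributes are discarded).
  const masked = html
    .replace(/<strong\b[^>]*>/gi, OPEN_TOKEN)
    .replace(/<\/strong\s*>/gi, CLOSE_TOKEN);

  // 2. Run the sanitizer supplied by the caller on the masked markup.
  const clean = sanitize(masked);

  // 3. Put plain <strong> tags back in place of the placeholders.
  return clean
    .split(OPEN_TOKEN).join("<strong>")
    .split(CLOSE_TOKEN).join("</strong>");
}

// Possible usage with DOMPurify (assumes DOMPurify is already loaded):
// const html = sanitizeKeepingStrong(dirty, (s) =>
//   DOMPurify.sanitize(s, { ALLOWED_TAGS: ["p"] })
// );
```

Passing the sanitizer as a parameter keeps the helper independent of how DOMPurify is loaded. As noted above, the original configuration already keeps the <strong> tags, so this is only needed when something else in the pipeline strips them.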