How can I find and select, in any HTML, only the text (spaces included, but without tabs and line breaks), without selecting the tags themselves?
I managed to do the opposite (selecting the tags), but not what I described above.
<html>
<body>
<h1> text1</h1>
<p>text2</p>
text14
<p> text3 </p>
text2
</body>
</html>
This is what I got:
<[^>]+>(.+?)</[^>]+>
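To see what this pattern actually captures, here is a small check against the sample above. The variable names `sample`, `tagPair` and `captured` are only illustrative, and the `g` flag is added so that `matchAll` can be used.

```javascript
// Sample markup from the question, one line per array entry.
const sample = [
  "<html>", "<body>", "<h1> text1</h1>", "<p>text2</p>", "text14",
  "<p> text3 </p>", "text2", "</body>", "</html>",
].join("\n");

// The pattern from the question; only the global flag is new.
const tagPair = /<[^>]+>(.+?)<\/[^>]+>/g;

// Collect the first capture group of every match.
const captured = [...sample.matchAll(tagPair)].map((match) => match[1]);
console.log(captured); // [" text1", "text2", " text3 "]
```

The loose text14 and the second text2 are missing: `.` does not cross a line break, so only text that sits between an opening and a closing tag on the same line is captured.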
2 Answers
Assuming you wanted
then you can use parseFromString and createNodeIterator and do this:
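A minimal sketch of that combination, assuming the goal is the list of visible strings from every text node with the surrounding whitespace trimmed. The helper names `keepVisible` and `extractTexts` are hypothetical, and the DOM part only runs where `DOMParser` exists (a browser, for example).

```javascript
// Assumption: whitespace-only nodes are dropped and the rest is trimmed.
function keepVisible(values) {
  return values.map((value) => value.trim()).filter((value) => value.length > 0);
}

// Parse the markup and walk all text nodes below <body>.
function extractTexts(html) {
  const doc = new DOMParser().parseFromString(html, "text/html");
  const iterator = doc.createNodeIterator(doc.body, NodeFilter.SHOW_TEXT);
  const values = [];
  let node;
  while ((node = iterator.nextNode())) {
    values.push(node.nodeValue);
  }
  return keepVisible(values);
}

// Browser-only demo; skipped in environments without a DOM.
if (typeof DOMParser !== "undefined") {
  console.log(extractTexts("<h1> text1</h1>\n<p>text2</p>\ntext14\n<p> text3 </p>\ntext2"));
}
```

Because the iterator visits text nodes rather than tags, loose text directly inside the body is found as well.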
The requirements as in the OP’s own words …
The approach needs to be manifold, because neither a purely regex-based approach nor one based only on DOMParser and NodeIterator is capable of returning the OP’s expected result. But a NodeIterator instance with an additionally applied filter, where the latter uses 2 regex-pattern-based tests, does the job …
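One way such a filter could look, as a sketch only. The two patterns are assumptions about what "with spaces but without tabs and line breaks" means: a node is accepted if it has visible text and no tab or line break between visible characters. The names `HAS_VISIBLE_TEXT`, `HAS_INNER_BREAK`, `acceptsTextValue`, `stripBreaks` and `collectFilteredText` are hypothetical.

```javascript
// Assumed test 1: the node holds at least one non-whitespace character.
const HAS_VISIBLE_TEXT = /\S/;
// Assumed test 2: a tab or line break sits between two visible characters.
const HAS_INNER_BREAK = /\S\s*[\t\r\n]\s*\S/;

function acceptsTextValue(value) {
  return HAS_VISIBLE_TEXT.test(value) && !HAS_INNER_BREAK.test(value);
}

// Remove leading and trailing tabs and line breaks, but keep spaces.
function stripBreaks(value) {
  return value.replace(/^[\t\r\n]+|[\t\r\n]+$/g, "");
}

function collectFilteredText(doc) {
  const iterator = doc.createNodeIterator(doc.body, NodeFilter.SHOW_TEXT, {
    acceptNode: (node) =>
      acceptsTextValue(node.nodeValue)
        ? NodeFilter.FILTER_ACCEPT
        : NodeFilter.FILTER_REJECT,
  });
  const result = [];
  let node;
  while ((node = iterator.nextNode())) {
    result.push(stripBreaks(node.nodeValue));
  }
  return result;
}

// Browser-only demo; skipped in environments without a DOM.
if (typeof DOMParser !== "undefined") {
  const doc = new DOMParser().parseFromString("<p> text3 </p>\ntext2\n", "text/html");
  console.log(collectFilteredText(doc));
}
```

Keeping the two tests in a plain function makes them easy to check without a DOM; the iterator's filter only translates the boolean into the `NodeFilter` constants.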