I want to find text that starts with any non-space character and ends with a space.
I can use \S+\s for that, on input such as:
wonders like <a href="https://example" title="Perito Moreno Glacier">Perito Moreno Glacier</a> and
The issue is that the <a> link element is also matched, so I want to skip the <a> element in this search.
I can use <a.*?a> to match the <a> element.
So I want to combine the \S+\s logic but skip <a.*?a>.
I have tried (\S+\s)|(<a.*?a>) but it didn't work.
The expected result is:
wonders
like
<a href="https://example" title="Perito Moreno Glacier">Perito Moreno Glacier</a>
and
2 Answers
As implied in the comments about regexp and HTML, I would use the DOM instead, just in case your string contains nested HTML.
However, if your string is flat, valid HTML, you could also have a look at this answer, which is about extracting words or "phrases".
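The parser-based idea can be sketched roughly as follows. The question does not name a language, so this assumes Python and uses the standard-library html.parser as a stand-in for a browser DOM. The names TokenCollector and split_keeping_links are hypothetical, chosen only for this illustration. Tags other than <a> that appear outside a link are simply dropped, and character references inside a link are decoded, which may or may not be what you want.

```python
from html.parser import HTMLParser


class TokenCollector(HTMLParser):
    """Collect whitespace-separated words, keeping each <a> element whole."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tokens = []
        self._depth = 0    # > 0 while we are inside an <a> element
        self._buffer = []  # pieces of the <a> element being rebuilt

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._depth += 1
        if self._depth:
            # get_starttag_text() returns the start tag as written in the input
            self._buffer.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> inside a link are kept verbatim
        if self._depth:
            self._buffer.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self._depth:
            return
        self._buffer.append(f"</{tag}>")
        if tag == "a":
            self._depth -= 1
            if self._depth == 0:
                # The link is complete: emit it as a single token
                self.tokens.append("".join(self._buffer))
                self._buffer = []

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)          # text inside the link stays together
        else:
            self.tokens.extend(data.split())   # text outside is split on whitespace


def split_keeping_links(html):
    parser = TokenCollector()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.tokens


sample = ('wonders like <a href="https://example" title="Perito Moreno Glacier">'
          'Perito Moreno Glacier</a> and')
for token in split_keeping_links(sample):
    print(token)
```

Because the link is rebuilt from parser events rather than matched with a pattern, markup nested inside the <a> element (for example a <b> tag) stays inside the same token.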
Maybe this?
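One pattern that gives the expected result on the sample input is sketched below, assuming Python's re module (the question does not name a language) and not necessarily matching the snippet originally posted here. A backtracking regex engine tries alternatives from left to right at each position, so with (\S+\s)|(<a.*?a>) the first branch already consumes "<a " before the link branch is ever tried. Putting the link branch first avoids that. Two deliberate changes are assumptions on my part: the link pattern is tightened to <a\b.*?</a>, and the trailing \s is dropped because the expected result lists the tokens without it.

```python
import re

# Sample input from the question
text = ('wonders like <a href="https://example" title="Perito Moreno Glacier">'
        'Perito Moreno Glacier</a> and')

# Branch order matters: the <a>...</a> branch is tried first, so a link is
# taken as one token; \S+ only applies where no link starts.
pattern = re.compile(r'<a\b.*?</a>|\S+')

tokens = pattern.findall(text)
for token in tokens:
    print(token)
```

This only holds for flat markup: a link that spans several lines would need the re.DOTALL flag, and nested or malformed HTML is better handled with a parser.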
Explanation