I’m working on a regex to match phrases in an HTML string. For example, I want to find every instance of “artificial intelligence” and return the <span> tag that immediately precedes it.
The trouble is that my regex returns only one large match.
Here is a link to an online regex builder I’ve been using: https://regex101.com/r/rK9yO9/1
I am looking to return the following two matches:
<span m='3'>
<span m='13'>
Example string:
<p><span m='2'>of</span> <span m='3'>artificial</span>
<span m='4'>intelligence.</span><span m='4'>So</span>
<span m='5'>that</span> <span m='6'>seems</span>
<span m='9'>good.</span> <span m='10'>The</span>
<span m='11'>impact</span> <span m='12'>of</span>
<span m='13'>artificial</span> <span m='14'>intelligence,</span>
<span m='15'>on</span> </p>
N.B. There are no newlines in the actual text; I added them for readability.
The regex I have so far is:
(<span.*>)artificial.?</span>.?<span.*>intelligence.?</span>
Which returns the following match:
<span m='2'>of</span> <span m='3'>artificial</span>
<span m='4'>intelligence.</span><span m='4'>So</span>
<span m='5'>that</span> <span m='6'>seems</span>
<span m='9'>good.</span> <span m='10'>The</span>
<span m='11'>impact</span> <span m='12'>of</span>
<span m='13'>artificial</span> <span m='14'>intelligence,</span>
2 Answers
Your regex is greedy: .* consumes as much text as it can. To make matching stop at the first occurrence, append ? to the quantifier (.*?).
will match
You can then read the tag from the first captured group.
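As a rough illustration of the greedy-versus-bounded behaviour, here is a small Python sketch run against the example string from the question. The names GREEDY, LAZY and BOUNDED are labels made up for this example, and the BOUNDED pattern (using [^>]* so a tag cannot run past its own closing >) is one possible fix, not necessarily the exact regex this answer originally showed. It also assumes the spans are separated by a single space where the question shows a line break.

```python
import re

# Example string from the question, joined onto one line.
# Assumption: a single space separates spans where the question shows a line break.
html = (
    "<p><span m='2'>of</span> <span m='3'>artificial</span> "
    "<span m='4'>intelligence.</span><span m='4'>So</span> "
    "<span m='5'>that</span> <span m='6'>seems</span> "
    "<span m='9'>good.</span> <span m='10'>The</span> "
    "<span m='11'>impact</span> <span m='12'>of</span> "
    "<span m='13'>artificial</span> <span m='14'>intelligence,</span> "
    "<span m='15'>on</span> </p>"
)

# Original pattern: greedy .* runs to the last possible '>'.
GREEDY = r"(<span.*>)artificial.?</span>.?<span.*>intelligence.?</span>"
# Lazy .*? stops early, but the match still starts at the leftmost <span.
LAZY = r"(<span.*?>)artificial.?</span>.?<span.*?>intelligence.?</span>"
# [^>]* cannot cross a '>', so each <span ...> stays within a single tag.
BOUNDED = r"(<span[^>]*>)artificial.?</span>.?<span[^>]*>intelligence.?</span>"

# With one capture group, re.findall returns the group text for each match.
greedy_tags = re.findall(GREEDY, html)
lazy_tags = re.findall(LAZY, html)
bounded_tags = re.findall(BOUNDED, html)

print(len(greedy_tags))  # 1 (one large match, as described in the question)
print(lazy_tags[0])      # <span m='2'>of</span> <span m='3'>
print(bounded_tags)      # ["<span m='3'>", "<span m='13'>"]
```

Note that in this sketch the lazy version yields two matches, but each captured group still begins at an earlier <span, so the negated character class is what narrows the group to the single tag in front of "artificial".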
Try this regex:
It should match only selected tags