I have this regex to match the src attribute of an image tag and also its alt or title attribute, but it works only if src comes first. How should I modify it to match these three in any order?
Or are there more accurate ways to do this by parsing the HTML elements? I assume that with a regex I might get back array elements without knowing which is which.
For example now it matches:
img src="landscape.jpg" title="My landscape"
but not
img title="My landscape" src="landscape.jpg"
current regex is:
preg_match_all('#<img\s+[^>]*src="([^"]*)"(?:\s+[^>]*(?:alt|title)="([^"]+)")?[^>]*>#is', $url_contents, $image_matches);
2 Answers
I found a simple example using DOMDocument: It does just what I wanted and seems way more reliable than what I tried by regex.
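A minimal sketch of the DOMDocument approach follows; it is not necessarily the example this answer found. The helper name extract_images and the sample markup are assumptions made for illustration. The point is that each attribute is read by name, so the order in the tag does not matter and every value comes back labeled.

```php
<?php
// Hypothetical helper: collect src, alt and title for every <img> in an HTML string.
function extract_images(string $html): array
{
    $doc = new DOMDocument();

    // Real-world markup is often malformed; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        // getAttribute() returns an empty string when the attribute is absent.
        $images[] = [
            'src'   => $img->getAttribute('src'),
            'alt'   => $img->getAttribute('alt'),
            'title' => $img->getAttribute('title'),
        ];
    }
    return $images;
}

// Sample input (assumed); in the question this would be the fetched page in $url_contents.
$url_contents = '<p><img title="My landscape" src="landscape.jpg"><img src="b.png" alt="B"></p>';
$image_matches = extract_images($url_contents);
```

Here $image_matches[0]['src'] and $image_matches[0]['title'] are filled in even though title comes before src in the tag.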
You could use:
(?<=<img) : behind me is an <img start tag
(?: (src|title|alt)="([^"]+)")? : look for a src, title, or alt attribute followed by its value and place them into capture groups
(?: (src|title|alt)="([^"]+)")? : again
(?: (src|title|alt)="([^"]+)")? : again
https://regex101.com/r/GXyAZf/1
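As a rough sketch of how this pattern could be used from PHP, the snippet below pairs each captured attribute name with its value, which addresses the concern about not knowing which is which. The helper name match_img_attributes, the / delimiters, and the i flag are assumptions added for illustration.

```php
<?php
// The three pieces above joined into one pattern: up to three src/title/alt
// attributes directly after "<img". Delimiters and the i flag are assumptions.
$pattern = '/(?<=<img)(?: (src|title|alt)="([^"]+)")?(?: (src|title|alt)="([^"]+)")?(?: (src|title|alt)="([^"]+)")?/i';

// Hypothetical helper: return one name => value array per <img> that had a match.
function match_img_attributes(string $html, string $pattern): array
{
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $images = [];
    foreach ($matches as $m) {
        $attrs = [];
        // Capture groups come in (name, value) pairs: 1-2, 3-4, 5-6.
        // Trailing unmatched groups are omitted, so bound the loop by count($m).
        for ($i = 1; $i + 1 < count($m); $i += 2) {
            if ($m[$i] !== '') {
                $attrs[strtolower($m[$i])] = $m[$i + 1];
            }
        }
        if ($attrs) {
            $images[] = $attrs;
        }
    }
    return $images;
}

$first  = match_img_attributes('<img src="landscape.jpg" title="My landscape">', $pattern);
$second = match_img_attributes('<img title="My landscape" src="landscape.jpg">', $pattern);
```

Both calls yield the same src and title values. Note that the pattern as written expects a single space before each attribute, double-quoted values, and the three attributes to come directly after <img; an unrelated attribute such as class in first position would stop the match.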