I’m using preg_match_all() like this:
preg_match_all('/("user":"(.*?)".*?-->)/', $input_lines, $output_array);
The idea is that I want to get whatever comes after "user" in a commented-out block (it’s a long story). So let’s say $input_lines is something like:
<!-- block {"user":"josh-hart", 1234566 ,"hide":false} /-->
<!-- block {"user":"jalen-brunson", 7744633 ,"hide":true} /-->
<!-- block {"user":"julius-randle", 333333,"hide":false} /-->
<!-- block {"user":"obi-toppin", 4hh3n33n33n, /-->
<!-- block {"user":"rj-barrett", nmremxxx!! ,"hide":true} /-->
<!-- block {"user":"mitch-robinson",yahaoao /-->
I want it to match the user. But here’s the thing: I only want the user if "hide":true does not appear before the /-->. So for this string I would want the matches to be:
josh-hart, julius-randle, obi-toppin, mitch-robinson
What is this called in regex terms, and how do I do it?
3 Answers
You may try that:
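Explanation:
user":"([^"]+)" – after matching the literal user":" it matches the characters up to the next " and captures them; the parentheses make the username capture group 1.
(?!.*"hide":true) – negative lookahead asserting that "hide":true does not appear later on the same line.
(?=.*/-->) – positive lookahead asserting that /--> follows on the same line. Combined with the negative lookahead, this means the /--> is not preceded by "hide":true.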
Demo
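The pattern itself is not shown above, so the following is only a sketch that assumes it is the three explained pieces joined in order. The `~` delimiter and the `$users` variable are choices made for this example. Because `.` does not match a newline by default, both lookaheads stay within one line, so this relies on each comment sitting on its own line.

```php
<?php
// Sample input from the question: one comment block per line.
$input_lines = <<<'TXT'
<!-- block {"user":"josh-hart", 1234566 ,"hide":false} /-->
<!-- block {"user":"jalen-brunson", 7744633 ,"hide":true} /-->
<!-- block {"user":"julius-randle", 333333,"hide":false} /-->
<!-- block {"user":"obi-toppin", 4hh3n33n33n, /-->
<!-- block {"user":"rj-barrett", nmremxxx!! ,"hide":true} /-->
<!-- block {"user":"mitch-robinson",yahaoao /-->
TXT;

// Assumed assembly of the pieces from the explanation. Using ~ as the
// delimiter means the / in /--> does not need escaping.
$pattern = '~user":"([^"]+)"(?!.*"hide":true)(?=.*/-->)~';

preg_match_all($pattern, $input_lines, $output_array);

// Capture group 1 holds the usernames.
$users = $output_array[1];
echo implode(', ', $users), PHP_EOL;
// prints: josh-hart, julius-randle, obi-toppin, mitch-robinson
```

If a comment can span several lines, this approach would need the `s` modifier and a way to stop the lookaheads at the closing `-->`, which is what the tempered-token answer further down addresses.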
As suggested by @bobblebubble, you can use the (*SKIP)(*FAIL) combo to skip whatever you don’t want to match:
Try it on regex101.com.
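The pattern for this answer is not included here, so the snippet below is not the answerer's regex. It is a minimal sketch of the (*SKIP)(*FAIL) idea under the assumption that a comment runs from `<!--` to `-->`: the first alternative consumes any comment containing "hide":true and then discards it, and the second alternative captures the username from whatever is left.

```php
<?php
// Sample input from the question.
$input_lines = <<<'TXT'
<!-- block {"user":"josh-hart", 1234566 ,"hide":false} /-->
<!-- block {"user":"jalen-brunson", 7744633 ,"hide":true} /-->
<!-- block {"user":"julius-randle", 333333,"hide":false} /-->
<!-- block {"user":"obi-toppin", 4hh3n33n33n, /-->
<!-- block {"user":"rj-barrett", nmremxxx!! ,"hide":true} /-->
<!-- block {"user":"mitch-robinson",yahaoao /-->
TXT;

// Illustrative pattern (an assumption, not taken from the answer):
//   alternative 1: a whole comment that contains "hide":true, which is
//                  consumed and then thrown away with (*SKIP)(*FAIL)
//   alternative 2: the username, captured in group 1
$pattern = '~<!--(?:(?!-->).)*"hide":true(?:(?!-->).)*-->(*SKIP)(*FAIL)'
         . '|"user":"([^"]+)"~';

preg_match_all($pattern, $input_lines, $output_array);

$users = $output_array[1];
echo implode(', ', $users), PHP_EOL;
// prints: josh-hart, julius-randle, obi-toppin, mitch-robinson
```

(*SKIP) makes the engine resume searching after the discarded comment rather than retrying inside it, which is why the "user" in a hidden block is never reached by the second alternative.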
Assuming that a comment opens with <!-- and closes with -->, and that neither marker is used anywhere in between, you can first get the matches that contain <!-- ... "hide":true ... --> out of the way, without crossing the opening or closing of a comment.
Then you can get a single match of the username, still in between the comment markers and independent of the order of appearance.
The pattern matches:
<!-- – Match literally
(?:(?!-->|"hide":).)*+ – Optionally repeat matching any character, as long as neither --> nor "hide": starts at that position, using a tempered greedy token (made possessive by the trailing +)
"hide":true\b – Match "hide":true followed by a word boundary to prevent a partial word match
(?:(?!-->).)*/--> – Match until the closing /-->
(*SKIP)(*F) – Skip the current match
| – Or
<!-- – Match literally
(?:(?!-->|"user").)* – Optionally repeat matching any character, as long as neither --> nor "user" starts at that position
"user":"\K – Match "user":" and forget what is matched so far
[^"]+ – Match 1+ chars other than " (the username that you want to match)
(?="(?:(?!-->).)*-->) – Assert --> to the right
Note that you can make the matching of the username more specific. As written, [^"]+ matches one or more characters other than a double quote, which can also be a space or a newline. If you want to match only non-whitespace characters other than a double quote, then you can change it to [^\s"]+
See a regex demo.
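For reference, here is a sketch that joins the pieces from the breakdown above into one PHP call. The full pattern is not printed in the answer, so the assembly, the `~` delimiter and the variable names are assumptions made for this example. Because of `\K`, the username is the whole match, so the results are in index 0 rather than in a capture group.

```php
<?php
// Sample input from the question.
$input_lines = <<<'TXT'
<!-- block {"user":"josh-hart", 1234566 ,"hide":false} /-->
<!-- block {"user":"jalen-brunson", 7744633 ,"hide":true} /-->
<!-- block {"user":"julius-randle", 333333,"hide":false} /-->
<!-- block {"user":"obi-toppin", 4hh3n33n33n, /-->
<!-- block {"user":"rj-barrett", nmremxxx!! ,"hide":true} /-->
<!-- block {"user":"mitch-robinson",yahaoao /-->
TXT;

// Part 1: consume a comment that contains "hide":true, then discard it.
$skipHidden = '<!--(?:(?!-->|"hide":).)*+"hide":true\b(?:(?!-->).)*/-->(*SKIP)(*F)';

// Part 2: inside a remaining comment, match only the username (\K drops
// everything matched before it) and assert that --> follows.
$matchUser = '<!--(?:(?!-->|"user").)*"user":"\K[^"]+(?="(?:(?!-->).)*-->)';

$pattern = '~' . $skipHidden . '|' . $matchUser . '~';

preg_match_all($pattern, $input_lines, $output_array);

// With \K the full match (index 0) is the username itself.
$users = $output_array[0];
echo implode(', ', $users), PHP_EOL;
// prints: josh-hart, julius-randle, obi-toppin, mitch-robinson
```

Without the `s` modifier the dot does not match a newline, so this sketch expects each comment on a single line, as in the sample input.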