
I have some strings that begin with ABC123 (a simplified example for demonstration; the actual strings may not be numbers or letters). From each string, I want to capture the character after ABC (1) and the character two positions after that (3), without capturing the 2 in between. I’ve managed to do the first part using a lookbehind, but haven’t been able to figure out the rest.

Regex101 here with what I’ve got so far.

3 Answers


  1. Chosen as BEST ANSWER

    For completeness here's the version of hanlog's answer I ended up using:

    (?<=ABC)(?>[A-Z0-9 -]{2})([A-Z0-9 -])(?>[A-Z0-9 -])(.)
    

    I removed the digit requirement for the non-capturing group and the names for the captured groups to help improve performance and/or memory, but other than that it's basically the same solution.
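
    As a rough sketch of how the unnamed groups could be read in PHP: the input "ABCXY1Z9" below is a made-up string chosen only because it fits the pattern, not a sample of the real data.

    ```php
    <?php
    // "ABCXY1Z9" is a hypothetical input, not taken from the real data.
    $re = '/(?<=ABC)(?>[A-Z0-9 -]{2})([A-Z0-9 -])(?>[A-Z0-9 -])(.)/';

    if (preg_match($re, "ABCXY1Z9", $matches) === 1) {
        // Unnamed groups are read by position: $matches[1] and $matches[2].
        // $matches[0] holds the whole match, including the skipped characters.
        var_dump($matches[1], $matches[2]);
    }
    ```

    Without names, the two captures are only available by their numeric index, so the order of the groups in the pattern matters.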


  2. function getMyChars($input) {
        $output = [false, false];
        $index = strpos($input, "ABC");
        if ($index !== false) {
            $index += strlen("ABC") - 1;
            if ($index + 1 < strlen($input)) {
                $output[0] = $input[$index + 1];
                if ($index + 3 < strlen($input)) {
                    $output[1] = $input[$index + 3];
                }
            }
        }
        return $output;
    }
    
    var_dump(getMyChars("ABC123"));
    

    I create an array of two elements inside the function. Both elements are initialized to false, which means that the character was not found.

    Then I compute the index of "ABC" to make sure it occurs in your string. The prefix could and should be a parameter; I hard-coded it here for the sake of simplicity. Then, for each character, I check whether its index is inside the string and, if so, fill the output accordingly.

    Finally I return the output.
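
    A minimal sketch of the parameterized variant mentioned above, assuming PHP 7 or later for the type declarations. The name getCharsAfterPrefix and the $prefix parameter are hypothetical, introduced only for illustration.

    ```php
    <?php
    // Hypothetical variant of getMyChars(): the prefix is passed in
    // instead of being hard-coded as "ABC".
    function getCharsAfterPrefix(string $input, string $prefix): array
    {
        $output = [false, false];          // false means "character not found"
        $start = strpos($input, $prefix);  // position of the prefix, or false
        if ($start === false) {
            return $output;
        }
        $first = $start + strlen($prefix); // index of the character right after the prefix
        if ($first < strlen($input)) {
            $output[0] = $input[$first];
        }
        if ($first + 2 < strlen($input)) {
            $output[1] = $input[$first + 2]; // skip one character, take the next
        }
        return $output;
    }

    var_dump(getCharsAfterPrefix("ABC123", "ABC"));
    ```

    The bounds checks are independent here, but the second can only succeed when the first does, so the behavior should match the nested version.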

  3. I would consider using two named capture groups with a non-capturing group in between. Here is an example:

    (?<=ABC)(?<match1>.)(?>\d{1})(?<match2>.)
    

    (?<=ABC) is a positive lookbehind. It asserts that the current position is immediately preceded by ABC, without making ABC part of the match.

    (?<match1>.) is a named capturing group that captures any single character (other than a newline, by default) using the "." pattern.

    (?>\d{1}) is an atomic group. Like the non-capturing group (?:...), it matches its pattern without capturing it. The pattern here is a single digit, "\d{1}", but you could use something else that fits your needs.

    (?<match2>.) is the second named capturing group, which also captures any single character.

    To use this in PHP, you could write:

    $inputString = "ABC123";
    $re = '/(?<=ABC)(?<match1>.)(?>\d{1})(?<match2>.)/m';
    preg_match($re, $inputString, $matches);
    
    $firstMatch = $matches['match1'];
    $secondMatch = $matches['match2'];
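
    The snippet above assumes the pattern matches. A hedged sketch that checks the return value of preg_match first might look like this; the helper name captureAroundSkipped is hypothetical, and PHP 7.1 or later is assumed for the nullable return type.

    ```php
    <?php
    // Hypothetical helper wrapping the pattern above; returns null when there is no match.
    function captureAroundSkipped(string $input): ?array
    {
        $re = '/(?<=ABC)(?<match1>.)(?>\d{1})(?<match2>.)/';
        if (preg_match($re, $input, $matches) !== 1) {
            return null; // no match (0) or regex error (false)
        }
        return [$matches['match1'], $matches['match2']];
    }

    var_dump(captureAroundSkipped("ABC123")); // the skipped character is a digit
    var_dump(captureAroundSkipped("ABC1x3")); // the skipped character is not a digit
    ```

    Reading $matches['match1'] without this check would raise an undefined-index notice or warning whenever the input does not match.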
    