
How to write a regex that validates a middle initial in PHP? Because some people have more than one middle initial, the regex should allow between one and three middle initials.
The regex should allow one letter with or without a period, two or three letters each followed by a period, or two or three letters each followed by a period and a single space.

Here is the list of allowed strings, where ‘A’ means any letter of upper or lower case, including Unicode letters in foreign alphabets:

'A', 'A.', 'A.A.', 'A. A.', 'A.A.A.', 'A. A. A.'

Here is a regex that I wrote as a starting point to validate exactly one middle initial in the English alphabet followed by an optional period.

$pattern = "/^([a-zA-Z]{1}[.]{0,1})?$/";    
preg_match_all($pattern, $input_string, $match) > 0;
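
A quick way to see what this starting pattern accepts is to run it over a few inputs. This is only a minimal sketch: the constant name and the helper is_single_initial are hypothetical names chosen for the illustration, and preg_match is used in place of preg_match_all because a single anchored match is enough here.

```php
<?php
// Starting-point pattern from the question: one ASCII letter, optional period.
// The "?" after the group makes the whole initial optional.
const SINGLE_INITIAL_PATTERN = '/^([a-zA-Z]{1}[.]{0,1})?$/';

// Hypothetical helper name, used only for this illustration.
function is_single_initial(string $input): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match(SINGLE_INITIAL_PATTERN, $input) === 1;
}

foreach (['A', 'A.', '', 'AB', 'A.A.'] as $candidate) {
    echo var_export($candidate, true), ' => ',
         var_export(is_single_initial($candidate), true), PHP_EOL;
}
```

Because the group is optional, the empty string passes as well, while 'AB' and 'A.A.' are rejected.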

2 Answers


  1. I guess much of this will depend on what you define as a valid name. For example:

    <?php
    
        $names = ['Bill', 'Bill.', 'Bill.Smith.', 'Bill. Smith.', 'Bill.B.Smith.', 'Bill. B. Smith.', 'Bill B Smith', 'Bill Smith'];
    
        $pattern = '/(?:[A-Za-zА-Яа-яΑ-Ωα-ω]\.\s?|[A-Za-zА-Яа-яΑ-Ωα-ω]{2,3}\.\s?|\s[A-Za-zА-Яа-яΑ-Ωα-ω]\s)/u';
        foreach( $names as $name ){
            if( preg_match_all($pattern, $name, $match) ){
                echo "$name is valid\n";
            } else {
                echo "$name is invalid\n";
            }
        }
    

    The results will be:

    Bill is invalid
    Bill. is valid
    Bill.Smith. is valid
    Bill. Smith. is valid
    Bill.B.Smith. is valid
    Bill. B. Smith. is valid
    Bill B Smith is valid
    Bill Smith is invalid
    
  2. Ok, so we will have 2 kinds of matches across the entire string.

    • A. A. A. pattern, where every letter is followed by a period.
    • A A A pattern, where every letter is followed by spaces rather than periods.

    As confirmed by you, A. A A. is NOT a valid pattern according to
    your requirement. So, we will not focus on it.

    The regex is built from two parts:

    • \p{L}\.\s* to match the A. A. A. pattern. \p{L} matches a single Unicode letter in any script.

    • \p{L}\s* to match the A A A pattern.

    Overall, the regex will be /^((\p{L}\.\s*)+|(\p{L}\s*)+)$/iu. The | indicates an alternative, so the match can come from either the first capturing group or the second. The u flag makes both the subject string and the pattern be treated as UTF-8.

    Snippet:

    <?php
    
    $tests = [
        'A',
        'A.',
        'A.A.',
        'A. A.',
        'A. A. A.',
        'A. A',
        'A. A A.',
        'A A. A.',
        'Ω.',
        'A.Ω.',
        'Ω',
        'A Ω.',
    ];
    
    foreach($tests as $test){
        echo $test," => ", var_dump(preg_match('/^((\p{L}\.\s*)+|(\p{L}\s*)+)$/ui', $test) === 1), PHP_EOL;
    }
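
The + quantifier places no upper limit on the number of initials. If the check should also be capped at three initials and restricted to the six forms listed in the question, one possible variation uses bounded repetition instead. This is a minimal sketch under that assumed reading of the requirement; the constant name and the helper is_valid_middle_initials are hypothetical names chosen for the illustration.

```php
<?php
// Assumed reading of the question: one letter with an optional period, or
// two to three letters, each followed by a period, either all unspaced or
// all separated by exactly one space.
const MIDDLE_INITIALS_PATTERN =
    '/^\p{L}(?:\.?|(?:\.\p{L}){1,2}\.|(?:\. \p{L}){1,2}\.)$/u';

// Hypothetical helper name, used only for this illustration.
function is_valid_middle_initials(string $input): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match(MIDDLE_INITIALS_PATTERN, $input) === 1;
}

$samples = ['A', 'A.', 'A.A.', 'A. A.', 'A.A.A.', 'A. A. A.', 'Ω.', 'A. A A.', 'A.A.A.A.'];
foreach ($samples as $sample) {
    echo $sample, ' => ', var_export(is_valid_middle_initials($sample), true), PHP_EOL;
}
```

With these samples, the first seven are accepted, while 'A. A A.' (mixed separators) and 'A.A.A.A.' (four initials) are rejected.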
    
