
I am using PHP 7.4 as well as PHP 8.2 and I have a regex that I use in PHP to match words (names). To be completely honest, I barely recognize this regex monster I created. Thus this question is asking for assistance in figuring it out. It is basically this:

$is_word = preg_match('/^(?![aeiou]{3,})(?:\D(?![^aeiou]{4,}[aeiou]*)(?![aeiou]{4,})){3,}$/i', $name);

I’ve been using it for more than 6 years to validate names in a script I wrote. preg_match() returns 1 if the name matches the word pattern and 0 if it does not, which I treat as a boolean.

But today it returned false on two names which should be deemed valid:

  • Li
  • Drantch

To test this out, you can use the following batch of test names; using pseudo names for example sake:

  • Nartinez
  • Drantch
  • Dratch
  • Xtmnprwq
  • Yelendez
  • Boldberg
  • Yelenovich
  • Allash
  • Mohamed
  • Li

I tried changing the second quantifier, {4,}, to {5,}:

$is_word = preg_match('/^(?![aeiou]{3,})(?:\D(?![^aeiou]{5,}[aeiou]*)(?![aeiou]{4,})){3,}$/i', $name);

It helped in cases which match names like “Drantch” but then it still completely missed two-letter names like “Li.”

How can this regex be tweaked to properly match all names? If not all names, how can it be adjusted to properly match “Drantch” and other obvious names other than “Li”?

Note that, “Xtmnprwq” is a fake test name so I can test negatives as well as positives.

3 Answers


  1. The {3,} in your non-capturing group mandates a minimum string length of 3 characters. If you want to allow Li, reduce it to {2,}.

    The negated character class inside your negative lookahead (?![^aeiou]{4,}[aeiou]*) triggers on 4 or more consecutive non-vowels, so the ntch in Drantch triggers it and disqualifies the input string. If you want to allow Drantch, raise that quantifier to {5,}.

    Code:

    $array = [
        'Nartinez',
        'Drantch',
        'Dratch',
        'Xtmnprwq',
        'Yelendez',
        'Boldberg',
        'Yelenovich',
        'Allash',
        'Mohamed',
        'Li',
    ];
    
    $regex = <<<REGEX
    /
    ^
    (?![aeiou]{3,})
    (?:
       \D(?![^aeiou]{5,}[aeiou]*)
       (?![aeiou]{4,})
    ){2,}
    $
    /ix
    REGEX;
    
    var_export(preg_grep($regex, $array));
    

    Output:

    array (
      0 => 'Nartinez',
      1 => 'Drantch',
      2 => 'Dratch',
      4 => 'Yelendez',
      5 => 'Boldberg',
      6 => 'Yelenovich',
      7 => 'Allash',
      8 => 'Mohamed',
      9 => 'Li',
    )
    

    To improve your pattern’s readability, state each restriction as its own negative lookahead, then follow them with the "core" requirement that all characters be letters and that the name meet a minimum length.

    $regex = <<<REGEX
    /
    ^
    (?![aeiou]{3})    #doesn't start with 3 consecutive vowels
    (?!.*[aeiou]{4})  #doesn't contain 4 consecutive vowels
    (?!.*[^aeiou]{5}) #doesn't contain 5 consecutive consonants
    [a-z]{2,}         #contains only letters, minimum of 2 characters
    $
    /ix
    REGEX;
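
    As a quick check, the readable pattern can be run against the test names from the question. This is only a sketch: the pattern is the one above collapsed onto a single line (so the x flag is dropped), and the variable names $names, $accepted, and $rejected are chosen here for illustration.

```php
<?php
// Test names from the question; "Xtmnprwq" is the deliberate negative case.
$names = ['Nartinez', 'Drantch', 'Dratch', 'Xtmnprwq', 'Yelendez',
          'Boldberg', 'Yelenovich', 'Allash', 'Mohamed', 'Li'];

// Same lookaheads as the readable pattern, written on one line.
$pattern = '/^(?![aeiou]{3})(?!.*[aeiou]{4})(?!.*[^aeiou]{5})[a-z]{2,}$/i';

// preg_grep() preserves the input keys, so array_values() reindexes from 0.
$accepted = array_values(preg_grep($pattern, $names));
$rejected = array_values(preg_grep($pattern, $names, PREG_GREP_INVERT));

var_export($rejected);
```

    With these ten names, only Xtmnprwq should end up in $rejected. Note that y counts as a consonant here because it is not in [aeiou].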
    
  2. Your regexp has the following constraints on words:

    • ^(?![aeiou]{3,}) – Can’t begin with 3 or more consecutive vowels
    • (?![^aeiou]{4,}[aeiou]*) – Can’t have 4 or more consecutive consonants in the middle
    • (?![aeiou]{4,}) – Can’t have 4 or more consecutive vowels in the middle
    • {3,} – Must be at least 3 characters long

    Li violates the 3 characters requirement.

    Drantch violates the 4 consecutive consonants restriction.

    Tweak or remove these bits of the regexp to change the restrictions and allow these names.
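
    To see which restriction a given name trips, the four constraints can be tested one at a time. The following is a rough sketch, not the original script: the function name violated_rules() and the rule labels are made up for illustration, and it does not reproduce the \D (non-digit) check. Because the lookaheads in the original sit after a consumed character, the two "run" rules only look from the second character onward.

```php
<?php
// Hypothetical helper for illustration; not part of the original script.
function violated_rules(string $name): array
{
    // Each pattern matches when the corresponding restriction is broken.
    // "^.+" forces at least one character before the run, mirroring the
    // position of the lookaheads inside the repeated group.
    $rules = [
        'starts with 3+ vowels'     => '/^[aeiou]{3}/i',
        '4+ consecutive non-vowels' => '/^.+[^aeiou]{4}/is',
        '4+ consecutive vowels'     => '/^.+[aeiou]{4}/is',
        'fewer than 3 characters'   => '/^.{0,2}\z/s',
    ];

    $violations = [];
    foreach ($rules as $label => $pattern) {
        if (preg_match($pattern, $name) === 1) {
            $violations[] = $label;
        }
    }
    return $violations;
}

foreach (['Li', 'Drantch', 'Nartinez'] as $name) {
    echo $name, ': ', implode(', ', violated_rules($name)) ?: 'no violations', PHP_EOL;
}
```

    Under these assumptions, Li is flagged only for its length and Drantch only for the ntch run, which matches the two explanations above.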

  3. To understand what your pattern is doing, try a visual tool such as https://regex101.com/r/vICSfO/1

    To help us help you, I recommend describing the business logic behind the pattern, ideally with a practical use case.
    For example, your regex looks overly complicated to me, but perhaps you need it exactly as it is for some reason.
    At first glance, it can be simplified:

    ^(?![aeiou]{3,})[a-zA-Z]{2,}$
    

    At the very least, replace {3,} with {2,} if you need to match two-character words.
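
    Before adopting the simplified pattern, it may be worth running it against the test names from the question. A minimal sketch, with $names and $rejected as illustrative variable names:

```php
<?php
// Test names from the question; "Xtmnprwq" is meant to be rejected.
$names = ['Nartinez', 'Drantch', 'Dratch', 'Xtmnprwq', 'Yelendez',
          'Boldberg', 'Yelenovich', 'Allash', 'Mohamed', 'Li'];

// The simplified pattern exactly as written above (no i flag).
$simplified = '/^(?![aeiou]{3,})[a-zA-Z]{2,}$/';

// PREG_GREP_INVERT returns the entries that do NOT match.
$rejected = array_values(preg_grep($simplified, $names, PREG_GREP_INVERT));

var_export($rejected);
```

    With these names $rejected comes back empty, so the simplified pattern also accepts Xtmnprwq: it no longer limits runs of consonants or vowels in the middle of the word. Whether that trade-off is acceptable depends on the business logic.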
