I need to match strings that may or may not contain a dot (.).
Here are the sample patterns:
res:shanghai:45610
res:chicago.usa:57450
Regex I’m using:
res:[\w]{4,15}:[1-9][0-9]*
It matches only res:shanghai:45610, but it should match both.
Since the second pattern contains a dot (.) in chicago.usa, it doesn’t match.
How do I alter the regex to match res:chicago.usa:57450 too? Specifically, it should allow a single dot in the middle of word characters, while still restricting the length of the field to a minimum of 4 and a maximum of 15 characters, including the dot.
3 Answers
Just add the dot to the character class, that is, use [\w.] in place of [\w].
But that would also match strings made up entirely of dots.
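To make the trade-off concrete, here is a minimal Perl sketch of the character-class change. The anchors (\A and \z) and the third test string are assumptions added for illustration; they are not part of the original question.

```perl
use strict;
use warnings;

# The dot is added to the character class. Inside [...] a dot is literal.
# Assumption: the whole string must match, so the pattern is anchored.
my $loose = qr/\Ares:[\w.]{4,15}:[1-9][0-9]*\z/;

# The first two strings come from the question; the third is a
# hypothetical string that shows the weakness of this approach.
my @samples = ('res:shanghai:45610', 'res:chicago.usa:57450', 'res:....:1');

# Map each sample to 1 (match) or 0 (no match).
my %loose_result = map { $_ => ($_ =~ $loose ? 1 : 0) } @samples;

print "$_ => $loose_result{$_}\n" for @samples;
```

All three strings match here, including the one whose middle field is nothing but dots, which is why the character class alone is not enough.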
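Basically, you want the middle part to satisfy two conditions at once: it is 4 to 15 characters long (counting the dot), and it consists of word characters with at most one dot in the middle.
Lookaheads can be used to perform such "AND" operations: the lookahead tests one condition without consuming any characters, and the rest of the pattern then tests the other.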
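One way the lookahead idea could look in Perl is sketched below. This is a reconstruction under assumptions, not necessarily the exact pattern the answer had in mind: the anchors and the helper name matches_res are hypothetical.

```perl
use strict;
use warnings;

# The lookahead checks the length of the field: 4 to 15 word characters
# or dots, followed by the next colon. It consumes nothing.
# The consuming part then checks the shape: word characters with at most
# one dot, which must have word characters on both sides.
# Assumption: the pattern is anchored to the whole string.
my $strict = qr/\Ares:(?=[\w.]{4,15}:)\w+(?:\.\w+)?:[1-9][0-9]*\z/;

# Hypothetical helper: returns 1 on a match, 0 otherwise.
sub matches_res {
    my ($str) = @_;
    return $str =~ $strict ? 1 : 0;
}

for my $s ('res:shanghai:45610', 'res:chicago.usa:57450', 'res:....:1', 'res:a.b:1') {
    print "$s => ", matches_res($s), "\n";
}
```

Both sample strings from the question match, while a field of only dots fails the shape test and a three-character field such as a.b fails the length test.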
Another possibility is doing the length check outside of the regex.
As mentioned, one can check the length outside of the regex. Since the dot can show up only after a word character and has to be followed by one, the pattern for the middle field is extremely simple: one or more word characters, optionally followed by a dot and one or more word characters.
The length test is then an ordinary comparison on the captured field.
The one advantage of this is the pattern’s simplicity: it avoids the more advanced combined use of lookaheads and consuming patterns.
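The approach above might be sketched as follows. The function name is_valid_res and the anchors are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $str has the expected shape and a
# middle field of 4 to 15 characters, 0 otherwise.
sub is_valid_res {
    my ($str) = @_;

    # Shape only: word characters with an optional single dot in the middle.
    # The middle field is captured so its length can be tested afterwards.
    my ($field) = $str =~ /\Ares:(\w+(?:\.\w+)?):[1-9][0-9]*\z/
        or return 0;

    # Length check in plain Perl; the dot counts toward the length.
    return (length($field) >= 4 && length($field) <= 15) ? 1 : 0;
}

print is_valid_res('res:chicago.usa:57450') ? "ok\n" : "rejected\n";
```

Keeping the two conditions separate also makes it easy to report which one failed, if that is ever needed.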
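Since Perl v5.32 we can also write the two length conditions as a single "chained comparison", with the lower bound, the length, and the upper bound in one expression.
Also see this discussed in perlop, under Operator Precedence and Associativity.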
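A sketch of the chained form, assuming Perl 5.32 or later is available. The function name is_valid_res_chained is hypothetical, and the regex is the same assumed shape-only pattern as before.

```perl
use v5.32;    # chained comparisons need Perl 5.32 or later
use warnings;

# Hypothetical helper: same check as before, with the length test
# written as one chained comparison.
sub is_valid_res_chained {
    my ($str) = @_;

    # Shape only: word characters with an optional single dot in the middle.
    my ($field) = $str =~ /\Ares:(\w+(?:\.\w+)?):[1-9][0-9]*\z/
        or return 0;

    # Equivalent to 4 <= length($field) && length($field) <= 15,
    # with length($field) evaluated only once.
    return (4 <= length($field) <= 15) ? 1 : 0;
}

say is_valid_res_chained('res:shanghai:45610');
```

On older Perl versions the chained expression is a syntax error, so the two-comparison form is the portable choice.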