So I have been practicing and reading about assertions, and ran into this problem:

const text14 = "123 456 7890";
const pattern12 = /\d+(?!0)/g;

let matches12 = text14.match(pattern12);
console.log(matches12);

The output is supposed to be ['123', '456'], yet it isn't.
It's ['123', '456', '7890'].

After tinkering with it a bit, I realized that when I put a space in the assertion as well as in the string itself, it did remove something, but only the 9.

const text14 = "123 456 789 0";
const pattern12 = /\d+(?! 0)/g;

let matches12 = text14.match(pattern12);
console.log(matches12);

Output:

['123', '456', '78', '0']

This made me believe that assertions work differently with numbers.
The outcome I've been trying to get is to turn the original "123 456 7890" into ['123', '456'] using the negative lookahead assertion 'x(?!y)'.

2 Answers


  1. The regular expression /\d+(?!0)/g will match all substrings that:

    • begin with a sequence of digits (as many as possible, at least one)
    • are not followed by a 0

    The problem is that 0 is itself a digit. \d+ is greedy, so it keeps consuming digits, zeroes included, until it reaches a character that isn't a digit (or the end of the string), and only then does the lookahead check that this next character is not a zero. Since that character is never a digit, the check always passes and the negative lookahead never rejects anything.

    You might be tempted to simply use negative lookbehind instead, like so:

    "123 456 7890".match(/\d+(?<!0)/g); // ["123", "456", "789"]
    

    But in such a case the regular expression will simply stop before the zero instead, and not discard the entire sequence as you wished. Instead, you should first match a sequence of digits that ends in a nonzero digit, then make sure there isn't another digit after that.

    "123 456 7890".match(/\d*[1-9](?!\d)/g); // ["123", "456"]
    

    Keep in mind that the way you write a regular expression can affect its performance. I would not expect this one to be very efficient. A more naïve approach would be to simply accept any sequence of digits and then filter the results with JavaScript:

    "123 456 7890"
        .match(/\d+/g)                  // ["123", "456", "7890"]
        .filter(s => s.at(-1) !== "0"); // ["123", "456"]
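One caveat with the filtering approach is that match() returns null rather than an empty array when a global regex finds nothing, so chaining .filter() directly can throw. A minimal sketch of a guard, where the helper name digitGroupsNotEndingInZero is hypothetical and not from the original answer:

```javascript
// Hypothetical helper: returns the digit groups in `text` that do not end in "0".
function digitGroupsNotEndingInZero(text) {
  // match() yields null (not []) when a /g regex has no matches,
  // so fall back to an empty array before filtering.
  const groups = text.match(/\d+/g) ?? [];
  return groups.filter((s) => !s.endsWith("0"));
}

console.log(digitGroupsNotEndingInZero("123 456 7890"));   // ["123", "456"]
console.log(digitGroupsNotEndingInZero("no digits here")); // []
```

endsWith("0") is used here in place of s.at(-1) only because it is available in slightly older engines; the behavior is the same for non-empty digit strings.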
    
  2. No, regular expression assertions do not treat digits any differently from other characters.

    Your digit match is too "greedy": the \d+ is matching all of the digits (including 0) before it checks the negative lookahead (?!0).

    So it does something like this:

    • Does 7890 match \d+? Yes.
    • Is 7890 followed by a 0? No (there are no remaining characters), so the negative lookahead (?!0) succeeds.
    • Therefore 7890 is successfully matched.
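The steps above can also be observed directly in JavaScript. This is only an illustrative sketch: it runs the original pattern and records which character follows each match, which is what the lookahead actually inspects. The variable names text and trace are made up for this example.

```javascript
const text = "123 456 7890";

// For each match of the original pattern, record the character that follows it.
const trace = [...text.matchAll(/\d+(?!0)/g)].map((m) => ({
  match: m[0],
  // text[i] is undefined past the end of the string
  next: text[m.index + m[0].length] ?? "(end of string)",
}));

console.log(trace);
// [
//   { match: "123",  next: " " },
//   { match: "456",  next: " " },
//   { match: "7890", next: "(end of string)" }
// ]
```

In every case the character after the match is not a 0, because \d+ has already consumed any 0 that was there.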

    You can try this out by going to: https://regex101.com/
    Enter your regular expression and test strings, then choose the "Regex Debugger" from the left-hand sidebar menu (under "TOOLS").

    regular-expressions.info is another great resource with a really good explanation of lookahead and lookbehind assertions.

    There are a couple of alternative patterns that might do what you want.

    \b[1-9]+(?!0)\b

    • excluding 0 from the digit match allows the negative lookahead to come into play
    • adding word boundary checks (\b) at the start and end ensures it matches only whole groups (avoiding partial matches like 78)
    • however, this will never match any group that contains 0 (which may not be what you want)

    Results:

    "123 456 7890" -> [123, 456]
    "123 456 789" -> [123, 456, 789]
    "1023 456 789" -> [456, 789]
    "000 1230 0456" -> []
    

    \b\d+(?<!0)\b

    • this uses a negative lookbehind assertion
    • "match every group of digits that doesn’t end with 0"
    • this allows 0 at the start or middle of the group
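To compare the two patterns on the same inputs, a small sketch like the following can help. The names samples, noZeroAnywhere, noTrailingZero and compare are hypothetical, and the second pattern needs an engine with lookbehind support.

```javascript
const samples = ["123 456 7890", "123 456 789", "1023 456 789", "000 1230 0456"];

const noZeroAnywhere = /\b[1-9]+(?!0)\b/g; // whole groups containing no 0 at all
const noTrailingZero = /\b\d+(?<!0)\b/g;   // whole groups that do not end in 0

// Returns the matches of both patterns for one input string.
function compare(s) {
  return {
    noZeroAnywhere: s.match(noZeroAnywhere) ?? [], // match() is null when nothing matches
    noTrailingZero: s.match(noTrailingZero) ?? [],
  };
}

for (const s of samples) {
  console.log(JSON.stringify(s), compare(s));
}
```

The two patterns differ only on groups such as 1023 or 0456, where a 0 appears somewhere other than the last position.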

    Results:

    "123 456 7890" -> [123, 456]
    "123 456 789" -> [123, 456, 789]
    "1023 456 789" -> [1023, 456, 789]
    "000 1230 0456" -> [0456]
    

    Note that consistent browser/engine support for negative lookbehind is only relatively new (at time of writing).

    It's been available in Chrome since 2017-10, Node.js since 2018-03, Firefox since 2020-06, and Safari since 2023-03.
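If older engines matter, one possible approach is to feature-detect lookbehind at run time and fall back to a pattern that avoids it. The sketch below is an assumption of mine, not part of the answer: the helper names are hypothetical, and the fallback \b\d*[1-9]\b is a lookbehind-free pattern that should accept the same groups (digits ending in a nonzero digit, bounded on both sides).

```javascript
// Hypothetical helper: reports whether this engine can parse lookbehind assertions.
function supportsLookbehind() {
  try {
    // Built from a string so that an unsupported engine throws here at run time
    // instead of failing to parse the whole script.
    new RegExp("(?<!0)");
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: whole digit groups that do not end in 0.
function groupsWithoutTrailingZero(text) {
  const pattern = supportsLookbehind()
    ? new RegExp("\\b\\d+(?<!0)\\b", "g") // lookbehind version from the answer
    : /\b\d*[1-9]\b/g;                    // assumed lookbehind-free fallback
  return text.match(pattern) ?? [];
}

console.log(groupsWithoutTrailingZero("000 1230 0456")); // ["0456"]
```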
