I’m trying to use a regex in JavaScript that disallows special characters and leading or trailing spaces in an input element. The maximum length is 50 characters.

The regular expression I’m using is

/^[a-zA-Z0-9]+([ ]?[a-zA-Z0-9]+)*$/

But this causes issues when the string is long and ends with a special character. From my research, this is due to catastrophic backtracking, but I have been unable to optimise the regex. Kindly help me find an optimal solution to this issue.

I tried debouncing the keyup event for the input element but that is also not solving the issue.

2 Answers


  1. You might start the pattern by matching a single character, so that the lookahead does not fire on every line, and then assert 0-49 allowed characters after it.

    Then put the pattern in a capture group inside a lookahead and match the string with a backreference to group 1. Because lookaheads are atomic, the engine cannot backtrack into the captured part.

    Using a case-insensitive pattern with the /i flag:

    ^[a-z0-9](?=[a-z0-9 ]{0,49}$)(?=([a-z0-9]+(?: [a-z0-9]+)*))\1$
    

    See a regex demo.
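A quick way to sanity-check the lookahead-plus-backreference idea is to run it over a few inputs. The sketch below assumes the backreference is written `\1`. Note that the pattern as posted expects a second alphanumeric directly after the first character, so it rejects inputs such as "a" or "a b". The `variant` constant is a hypothetical adjustment (not part of the answer) that swaps the leading `+` inside the group for `*` so those inputs are accepted.

```javascript
// Pattern as posted, with the backreference written as \1.
const posted = /^[a-z0-9](?=[a-z0-9 ]{0,49}$)(?=([a-z0-9]+(?: [a-z0-9]+)*))\1$/i;

// Hypothetical variant (an assumption, not from the answer): [a-z0-9]* instead of
// [a-z0-9]+ at the start of group 1, so a one-character first word also matches.
const variant = /^[a-z0-9](?=[a-z0-9 ]{0,49}$)(?=([a-z0-9]*(?: [a-z0-9]+)*))\1$/i;

const samples = [
  "Hello World",
  "a",
  "a b",
  " lead",
  "trail ",
  "two  spaces",
  "a".repeat(60) + "!", // too long and ends with a special character
];

// Print each sample (truncated for display) with the result of both patterns.
for (const s of samples) {
  console.log(JSON.stringify(s.slice(0, 15)), posted.test(s), variant.test(s));
}
```

Because the first lookahead rejects anything longer than 50 characters or containing a disallowed character before the second lookahead runs, the long sample fails immediately rather than triggering heavy backtracking.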

  2. It’s as simple as removing the optional quantifier (?) from behind the space:
    /^[a-zA-Z0-9]+([ ][a-zA-Z0-9]+)*$/ should be fine.

    We need to argue two things:

    1. This doesn’t change what the RegEx matches: The optional quantifier was unnecessary. The Kleene star (*) already makes the entire space-delimited part (([ ][a-zA-Z0-9]+)) optional. If there is no space, you’d have just alphanumerics, which the [a-zA-Z0-9]+ would’ve matched.
    2. This changes the runtime drastically: A character is either a space or matches [a-zA-Z0-9]. There are no "ambiguities"; for each character there is only one state the RegEx engine can advance into: if it is an alphanumeric, the engine must stay in the greedy quantifier; if it is a space, the engine must advance and then expect at least one alphanumeric. In particular, the engine no longer has to choose between entering another iteration of the Kleene star and staying in the current [a-zA-Z0-9]+.
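
Both points can be checked empirically, at least roughly. The sketch below compares the two patterns on every string of up to six characters drawn from a small, arbitrarily chosen alphabet (a letter, a space, and a special character), then runs only the fixed pattern on a long input ending in a special character. The helper `allStrings` and the sample sizes are illustrative assumptions; an exhaustive check on short strings is evidence, not a proof.

```javascript
const original = /^[a-zA-Z0-9]+([ ]?[a-zA-Z0-9]+)*$/;
const fixed = /^[a-zA-Z0-9]+([ ][a-zA-Z0-9]+)*$/;

// Yield every string of length 0..maxLen over the given alphabet.
function* allStrings(alphabet, maxLen) {
  let level = [""];
  for (let len = 0; len <= maxLen; len++) {
    yield* level;
    level = level.flatMap((s) => alphabet.map((c) => s + c));
  }
}

// Point 1: collect inputs on which the two patterns disagree.
// The strings are kept short so the original pattern stays cheap to run.
const mismatches = [];
for (const s of allStrings(["a", " ", "!"], 6)) {
  if (original.test(s) !== fixed.test(s)) mismatches.push(s);
}
console.log(mismatches.length); // 0 when the patterns agree on every sample

// Point 2: the fixed pattern rejects a long input with a trailing special
// character. The original pattern is deliberately not run on this input.
const longInput = "a".repeat(5000) + "!";
console.log(fixed.test(longInput)); // false
```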

    I’ve applied some more changes, moving the Kleene star to the front and using the i flag to shorten it further (you can then omit the A-Z). The [ and ] brackets around the space were also unnecessary. The result, which can be run against The fourth bird’s tests, is /^([a-z0-9]+ )*[a-z0-9]+$/i.

    Like your current RegEx, this doesn’t enforce the length requirement. I’d recommend enforcing the length requirement in JavaScript before testing against the RegEx. This will be the most readable and efficient approach.
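
A minimal sketch of that recommendation, assuming an empty string should also be rejected (the pattern requires at least one alphanumeric anyway). The names `MAX_LENGTH`, `WORDS`, and `isValidInput` are made up for illustration.

```javascript
const MAX_LENGTH = 50; // maximum length from the question
const WORDS = /^([a-z0-9]+ )*[a-z0-9]+$/i;

// Check the length first, then run the regex only on strings that are
// short enough to be acceptable.
function isValidInput(value) {
  if (value.length === 0 || value.length > MAX_LENGTH) return false;
  return WORDS.test(value);
}

console.log(isValidInput("Hello World 42")); // true
console.log(isValidInput("trailing space ")); // false
console.log(isValidInput("a".repeat(51))); // false
```

In the page, this could be called from the input's event handler with the element's current value.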
