
I encountered some strange behaviour: negated character classes traverse newlines even when the m (multiline) flag is not provided.

> node
Welcome to Node.js v22.7.0.
Type ".help" for more information.
> 'abc\nabc\nabc\n'.replace(/b[^z]+/g, '')
'a'
> 'abc\nabc\nabc\n'.replace(/b[^z\n]+/g, '')
'a\na\na\n'

I expected that the first result would only be the case when the m multiline flag is enabled:

> 'abc\nabc\nabc\n'.replace(/b[^z]+/gm, '')
'a'

Is this a bug, or is this expected? If it is expected, what is the reasoning?

I was able to work around it with this usage of ?$ at the end:

> 'abc\nabc\nabc\n'.replace(/b[^z]+?$/g, '')
'a'
> 'abc\nabc\nabc\n'.replace(/b[^z]+?$/gm, '')
'a\na\na\n'

2 Answers


  1. In your first regex pattern:

    b[^z]+
    

    The last term, [^z]+, matches one or more of any character other than z, and that includes whitespace and newlines.

    As a side note, when we want .* to match across newlines and no dotAll mode (the s flag) is available, we can use [\s\S]*, which relies on similar logic to your first pattern.

  2. From the documentation you linked to:

    … if m is used, ^ and $ change from matching at only the start or end of the entire string to the start or end of any line within the string.

    You aren’t using these in your regular expression so I wouldn’t expect to see a difference in behaviour when removing the multiline option.
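
To make that concrete, here is a minimal sketch that reuses the patterns from the question. The variable names are only illustrative, and the input is assumed to be the three-line string from the question. It suggests that m has no effect on a pattern without anchors, and only changes the result once $ is involved.

```javascript
// Assumed input: the three-line string from the question.
const input = 'abc\nabc\nabc\n';

// No ^ or $ in the pattern, so the m flag has nothing to change.
const withoutM = input.replace(/b[^z]+/g, '');
const withM = input.replace(/b[^z]+/gm, '');
console.log(withoutM === withM); // true

// With $ in the pattern, m matters:
// without m, $ matches only at the very end of the input;
// with m, $ also matches just before each newline.
const anchoredWithoutM = input.replace(/b[^z]+?$/g, '');
const anchoredWithM = input.replace(/b[^z]+?$/gm, '');
console.log(JSON.stringify(anchoredWithoutM)); // "a"
console.log(JSON.stringify(anchoredWithM));    // "a\na\na\n"
```

This would also explain why the workaround in the question behaves differently with and without m: the lazy +? stops at the first position where $ can match, and m moves that position from the end of the input to the end of the first line.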

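As a rough illustration of the first answer, the sketch below checks that a negated class accepts a newline unless the newline is excluded explicitly, and compares the dot with [\s\S]. The names text, perLine, dotMatch, anyMatch and dotAllMatch are made up for this example.

```javascript
// Assumed input: the three-line string from the question.
const text = 'abc\nabc\nabc\n';

// [^z] excludes only "z", so "\n" is an allowed character.
const newlineAllowed = /[^z]/.test('\n');
console.log(newlineAllowed); // true

// Excluding "\n" explicitly keeps each match on a single line.
const perLine = text.replace(/b[^z\n]+/g, '');
console.log(JSON.stringify(perLine)); // "a\na\na\n"

// The dot stops at line terminators unless the s (dotAll) flag is set,
// whereas [\s\S] matches any character regardless of flags.
const dotMatch = text.match(/b.*/)[0];
const anyMatch = text.match(/b[\s\S]*/)[0];
const dotAllMatch = text.match(/b.*/s)[0];
console.log(JSON.stringify(dotMatch));    // "bc"
console.log(JSON.stringify(anyMatch));    // "bc\nabc\nabc\n"
console.log(anyMatch === dotAllMatch);    // true
```

So if the intent is "everything up to the end of the line", excluding \n in the class (or using the dot) is probably the more direct fix than relying on m.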