Why do negated character classes in JavaScript regular expressions traverse newlines even with multiline mode disabled?

balupton
September 13, 2024
202 views
1 vote
2 Answers

I encountered some strange behaviour: negated character classes traverse newlines even when the m (multiline) flag is not provided.

> node
Welcome to Node.js v22.7.0.
Type ".help" for more information.
> 'abc\nabc\nabc\n'.replace(/b[^z]+/g, '')
'a'
> 'abc\nabc\nabc\n'.replace(/b[^z\n]+/g, '')
'a\na\na\n'

I expected the first result to occur only when the m (multiline) flag is enabled:

> 'abc\nabc\nabc\n'.replace(/b[^z]+/gm, '')
'a'

Is this a bug, or is this expected? If it is expected, what is the reasoning?

I was able to work around it by adding ?$ at the end and enabling the m flag:

> 'abc\nabc\nabc\n'.replace(/b[^z]+?$/g, '')
'a'
> 'abc\nabc\nabc\n'.replace(/b[^z]+?$/gm, '')
'a\na\na\n'

Tags: javascript regex

Answers

- TimBiegeleisen
- September 13, 2024 at 4:00 pm
- 0 votes
In your first regex pattern:
```
b[^z]+
```
The last term, [^z]+, matches one or more characters other than z. That includes every other character, whitespace and newlines among them.

As a side note, when we want .* to match across newlines and dotAll mode (the s flag) is not available, we can use [\s\S]*, which relies on the same logic as your first pattern.
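A minimal sketch of the point above, assuming the input contains real newline characters; the variable names are illustrative only and do not come from the question:

```javascript
// Input with real newline characters between the lines.
const input = 'abc\nabc\nabc\n';

// [^z] means "any character except z", so a newline qualifies too.
const newlineMatches = /[^z]/.test('\n'); // true

// The greedy match starts at the first 'b' and runs to the end of the string.
const greedy = input.replace(/b[^z]+/g, ''); // 'a'

// Excluding '\n' explicitly keeps each match on a single line.
const perLine = input.replace(/b[^z\n]+/g, ''); // 'a\na\na\n'

// Without the s (dotAll) flag, '.' stops at a newline; [\s\S] does not.
const dotMatch = 'x\ny'.match(/x.*/)[0]; // 'x'
const anyMatch = 'x\ny'.match(/x[\s\S]*/)[0]; // 'x\ny'
```

If the goal is to stay on one line, listing \n inside the negated class is probably the most direct fix, since it does not depend on any flag.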

- phuzi
- September 13, 2024 at 4:32 pm
- 0 votes
From the documentation you linked to:

… if m is used, ^ and $ change from matching at only the start or end of the entire string to the start or end of any line within the string.

Your first regular expression uses neither ^ nor $, so I wouldn't expect the multiline flag to make any difference to its behaviour.
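As a rough illustration, assuming the same kind of input as in the question (the variable names here are made up for the example), the m flag only changes the result once $ appears in the pattern:

```javascript
// Input with real newline characters between the lines.
const text = 'abc\nabc\nabc\n';

// No ^ or $ in the pattern: the m flag has nothing to act on.
const withoutM = text.replace(/b[^z]+/g, '');  // 'a'
const withM = text.replace(/b[^z]+/gm, '');    // 'a'

// With $ and no m flag, $ matches only at the very end of the string,
// so the lazy quantifier still has to consume everything up to that point.
const endOfString = text.replace(/b[^z]+?$/g, ''); // 'a'

// With $ and the m flag, $ also matches just before each newline,
// so the lazy quantifier can stop at the end of every line.
const endOfLine = text.replace(/b[^z]+?$/gm, ''); // 'a\na\na\n'
```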
