I’m trying to remove accents and special characters, except dash (-) and underscore (_), while preserving the file extension of the string. For example:
ÁÉÍÓÚáéíóúâêîôûàèìòùÇãç.,~!@#$%&_-12345.png
to:
AEIOUaeiouaeiouaeiouCac_-12345.png
I came up with this, but the problem is that it removes all dots. I need to keep the last occurrence only, so that the filename's extension is preserved.
"ÁÉÍÓÚáéíóúâêîôûàèìòùÇãç.,~!@#$%&-12345.png".normalize('NFD').replace(/[^a-zA-Z0-9-]/g, "")
I already tried a negative lookbehind like this:
/[^a-zA-Z0-9-]+(?<!\.)/g
using this reference, but without success.
"ÁÉÍÓÚáéíóúâêîôûàèìòùÇãç.,~!@#$%&-12.34.5.png".normalize('NFD').replace(/[^a-zA-Z0-9-]+(?<!\.)/g, '')
If there is more than one dot, as in this case, it only removes the first one.
3 Answers
Instead of checking every character for being part of a file extension, select the whole extension at once. ((?<name>blah) is just a named (blah) group; it is named only for the sake of explanation.)
A negative lookahead that excludes any . followed by any letter or number, which in turn is followed by a non-word character, can work. An alternative to [^a-zA-Z0-9-._] is [^\w.-].
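The "select the whole extension at once" idea could look roughly like the sketch below. It is only an illustration: the function name sanitizeFilename and the exact pattern are assumptions, not the code from the answer. It splits the string at the last dot using two named groups and cleans each part separately.

```javascript
// Hypothetical helper; the name and pattern are illustrative assumptions.
function sanitizeFilename(filename) {
  // "name" greedily takes everything up to the last dot,
  // "ext" takes the dot-free remainder after it.
  const match = filename.match(/^(?<name>.*)\.(?<ext>[^.]+)$/);

  // No usable dot: treat the whole string as the name.
  const { name, ext } = match ? match.groups : { name: filename, ext: "" };

  // NFD splits accented letters into base letter + combining mark;
  // [^\w-] then drops everything except letters, digits, "_" and "-".
  const clean = (s) => s.normalize("NFD").replace(/[^\w-]/g, "");

  const cleanedName = clean(name);
  return ext ? `${cleanedName}.${clean(ext)}` : cleanedName;
}

console.log(sanitizeFilename("ÁÉÍÓÚáéíóúâêîôûàèìòùÇãç.,~!@#$%&_-12345.png"));
// AEIOUaeiouaeiouaeiouCac_-12345.png
```

Because the name is cleaned separately from the extension, any number of dots inside the name is handled the same way.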
In your pattern you forgot to exclude the _, which you want to keep in the result.
You are using a negative lookbehind that asserts that, from the current position, there is not a dot directly to the left. The negated character class [^a-zA-Z0-9-]+ can match a dot, but the lookbehind (?<!\.) fails if the match ended with a dot, so a match can never end with a dot.
What you could do is match this character class [^a-zA-Z0-9-]+, but assert that from the current position to the right there is a dot followed by 1+ chars other than a dot or a whitespace char until the end of the string, using a positive lookahead (?=.*\.[^\s.]+$).
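Put together, that suggestion might look like the sketch below. The helper name removeSpecialChars is hypothetical, and _ has been added to the character class, which is an assumption based on the remark that it should be kept.

```javascript
// Hypothetical helper name; not taken from the original answer.
function removeSpecialChars(filename) {
  return filename
    // Split accented letters into base letter + combining mark.
    .normalize("NFD")
    // Remove runs of disallowed characters, but only where a later dot
    // followed by 1+ non-dot, non-whitespace characters ends the string.
    .replace(/[^a-zA-Z0-9_-]+(?=.*\.[^\s.]+$)/g, "");
}

console.log(removeSpecialChars("ÁÉÍÓÚáéíóúâêîôûàèìòùÇãç.,~!@#$%&-12.34.5.png"));
// AEIOUaeiouaeiouaeiouCac-12345.png
```

The last dot survives because nothing after it satisfies the lookahead. For the same reason, characters after the last dot are left untouched, so this sketch assumes the extension itself is already clean.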