I am using the function below to strip unwanted characters while still displaying special characters:
function replaceSpecialChar(text) {
  return text.replace(/[^\x20-\x7E\n\xC0-\xFF\u00C0-\u00FF\u0152\u0153\u0178]+/g, '');
}
However, the following characters are not displayed as expected: Ÿ, Œ and œ.
I tried adding their individual Unicode code points to the regex, but a different value is returned for each of them:
Ÿ is returned as x
Œ is returned as R
œ is returned as S
2 Answers
The issue you’re experiencing might be due to the way JavaScript handles Unicode characters. The range you’re using in your regular expression (\xC0-\xFF) only covers the Latin-1 Supplement Unicode block, which includes characters from À to ÿ, but does not include characters like Œ, œ, and Ÿ because they are part of the Latin Extended-A block.
To include these characters, you would need to extend your range to cover the necessary Unicode blocks. JavaScript’s handling of Unicode can be a bit tricky, especially for characters outside the Basic Multilingual Plane (BMP), although Œ (U+0152), œ (U+0153), and Ÿ (U+0178) are all inside the BMP.
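To make the block boundaries concrete, here is a small sketch that reports each character’s code point and whether it falls within Latin-1 (at or below U+00FF). The helper name describeChar is hypothetical. The lowByte field is included because the characters reported in the question (x, R, S) happen to match the low 8 bits of each code point; that this points to an 8-bit truncation somewhere in the pipeline is an assumption, not something the question confirms.

```javascript
// Hypothetical helper: inspect a single character's code point.
function describeChar(ch) {
  const cp = ch.codePointAt(0);
  return {
    // Code point in U+XXXX notation.
    codePoint: 'U+' + cp.toString(16).toUpperCase().padStart(4, '0'),
    // Latin-1 Supplement ends at U+00FF.
    inLatin1: cp <= 0xff,
    // Character obtained by keeping only the low 8 bits of the code point.
    lowByte: String.fromCharCode(cp & 0xff),
  };
}

// Ÿ, Œ and œ written as escapes to avoid source-encoding surprises.
for (const ch of ['\u0178', '\u0152', '\u0153']) {
  console.log(ch, describeChar(ch));
}
```

All three characters lie above U+00FF, so a range ending at \xFF cannot match them.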
Here’s an updated version of your function that should handle the characters you mentioned:
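A minimal sketch of such a function, reconstructed from the description that follows; the exact regular expressions are assumptions rather than the answerer’s original code:

```javascript
function replaceSpecialChar(text) {
  return text
    // Split precomposed characters into base letter + combining mark.
    .normalize('NFD')
    // Remove combining marks in the Combining Diacritical Marks block.
    .replace(/[\u0300-\u036f]/g, '')
    // Remove everything that is not an ASCII letter or digit (assumed pattern).
    .replace(/[^a-zA-Z0-9]/g, '');
}

console.log(replaceSpecialChar('\u00E9')); // é
console.log(replaceSpecialChar('\u0178')); // Ÿ
console.log(replaceSpecialChar('\u0152uf')); // Œuf
```

Under this sketch Ÿ is reduced to Y, while Œ and œ have no canonical decomposition and are removed by the last step, so it strips these characters rather than preserving them.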
This function first normalizes the input text to its decomposed form (using the ‘NFD’ form), where combined characters like é are split into their base character e and the combining accent mark. Then it removes the combining accent marks (the \u0300-\u036f range covers the Combining Diacritical Marks block), and finally removes all non-alphanumeric characters.
Please note that this function will remove all special characters, not just the ones you mentioned. If you want to keep certain special characters, you will need to adjust the regular expression in the last replace call accordingly.
Also, be aware that JavaScript’s handling of Unicode can vary between environments, so you may need to adjust this code depending on where you’re running it.
Set the unicode property (the u flag) on your regular expression.
It will:
So your resulting regexp would look like this:
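The answer’s own pattern is not preserved here. As a rough sketch, assuming “unicode property” refers to the u flag, the character class from the question can be kept as-is with the flag added; the function name keepLatinChars is hypothetical:

```javascript
// Hypothetical name; the character class is carried over from the question,
// with the u flag added so the pattern is matched by code point.
function keepLatinChars(text) {
  return text.replace(/[^\x20-\x7E\n\xC0-\xFF\u00C0-\u00FF\u0152\u0153\u0178]+/gu, '');
}

console.log(keepLatinChars('\u0178, \u0152 and \u0153')); // Ÿ, Œ and œ are kept
console.log(keepLatinChars('a\u20ACb')); // the euro sign is not in the class
```

With the u flag, characters outside the BMP are treated as single code points instead of two separate surrogate code units.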