In PHP,
mb_strtolower('İspanyolca');
returns
U+0069 i LATIN SMALL LETTER I
U+0307 ̇ COMBINING DOT ABOVE
U+0073 s LATIN SMALL LETTER S
U+0070 p LATIN SMALL LETTER P
etc.
I need to get rid of the U+0307 COMBINING DOT ABOVE.
I tried this:
$TheUrl=mb_strtolower('İspanyolca');
$TheUrl=normalizer_normalize($TheUrl,Normalizer::FORM_C);
The combining dot above persists.
Any help would be appreciated.
2 Answers
You can write a custom PHP function that performs Unicode normalization and then removes characters that are not part of the basic Latin alphabet.
For example:
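A minimal sketch of that idea, assuming the mbstring and intl extensions are available. The function name lowercaseToAscii is hypothetical. FORM_C does not help in the question because there is no precomposed character for "i" + U+0307, so the pair is left as it is; this sketch decomposes with FORM_D instead and then strips everything outside ASCII.

```php
<?php
// Hypothetical helper name, not taken from the original answer.
// Assumes the mbstring and intl extensions are loaded.
function lowercaseToAscii(string $text): string
{
    // Lowercase first; on recent PHP versions this turns "İ" into "i" + U+0307.
    $lower = mb_strtolower($text, 'UTF-8');

    // NFD splits precomposed letters (e.g. "é") into base letter + combining mark.
    $decomposed = normalizer_normalize($lower, Normalizer::FORM_D);
    if ($decomposed === false) {
        // Normalization failed (e.g. invalid UTF-8); fall back to the lowercased string.
        $decomposed = $lower;
    }

    // Drop every code point outside the ASCII range, including U+0307.
    return preg_replace('/[^\x00-\x7F]/u', '', $decomposed) ?? '';
}

echo lowercaseToAscii("\u{0130}spanyolca"), PHP_EOL; // ispanyolca
```

Note that this removes any letter with no ASCII decomposition (for example the Turkish dotless "ı"), so it only suits cases such as URL slugs where dropping those characters is acceptable.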
To handle this case, you can use the strtr function to replace specific characters in the string. The replacement maps both the lowercase 'i' followed by a combining dot above and the uppercase 'İ' to a regular lowercase 'i'.
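A sketch of the strtr approach, since the answer's own code is not shown here. The function name lowercaseDottedI and the exact replacement map are assumptions for illustration; only mb_strtolower and strtr come from the thread.

```php
<?php
// Hypothetical helper name; the replacement map is an assumption based on the
// description above. Requires PHP 7.0+ for the "\u{...}" escape syntax.
function lowercaseDottedI(string $text): string
{
    $lower = mb_strtolower($text, 'UTF-8');

    $map = [
        "i\u{0307}" => 'i', // "i" + COMBINING DOT ABOVE, produced by lowercasing "İ"
        "\u{0130}"  => 'i', // "İ" itself, in case it survives lowercasing
    ];

    // strtr with an array replaces the longest matching keys first.
    return strtr($lower, $map);
}

echo lowercaseDottedI("\u{0130}spanyolca"), PHP_EOL; // ispanyolca
```

Unlike stripping all non-ASCII characters, this leaves other accented letters untouched and changes only the dotted "i" forms.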