I’m trying to decode a text that contains extended ASCII characters but when I try to convert the character I get the wrong value. Like this:
echo "“<br>";
echo ord("“")."<br>";
echo chr(ord("“"))."<br>";
And this is my output:
“
226
�
The ASCII value of the character "“" is 147, not 226. And instead of the � symbol, I want to get "“" character back.
I’m using UTF-8
<meta charset="utf-8">
I have tried changing to different charsets but it didn’t work.
3 Answers
You’re incorrect about the “ character. Code point 147 (U+0093) is the control character SET TRANSMIT STATE, and its UTF-8 encoding is two bytes: c2 93.
In the manual for ord() it says:
On top of this, if I actually convert the '“' character to hexadecimal, I get: e2809c. So it’s a triplet. Never trust what you read online. 😏 See: https://3v4l.org/57UV8
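To make the difference concrete, here is a minimal sketch that dumps both encodings side by side. It assumes the file is saved as UTF-8 and that the mbstring extension is loaded (mb_chr needs PHP 7.2 or later). The variable names are only illustrative.

```php
<?php
// Assumption: this source file is saved as UTF-8, so the literal below is 3 bytes.
$quote = "“";                        // U+201C LEFT DOUBLE QUOTATION MARK

// Code point 147 (0x93) is a C1 control character, not a quotation mark.
$c1Control = mb_chr(0x93, 'UTF-8');  // U+0093 SET TRANSMIT STATE

echo bin2hex($quote), "\n";      // e2809c (three bytes)
echo bin2hex($c1Control), "\n";  // c293   (two bytes)
```

So 147 and “ only coincide in certain single-byte code pages, never in UTF-8.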
1st: U+201C Left Double Quotation Mark is the UTF-8 byte sequence E2 80 9C (hexadecimal), i.e. decimal 226 128 156.
2nd: the manual describes ord as "Convert the first byte of a string to a value between 0 and 255". Result: ord("“") returns 226.
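The two points above can be checked with a short sketch, assuming a UTF-8 source file and the mbstring extension. It lists every byte of the string, then shows that feeding the first byte back through chr() produces a lone lead byte, which is not valid UTF-8 on its own. That is why the browser prints �.

```php
<?php
$quote = "“"; // assumption: file saved as UTF-8

// unpack('C*', ...) returns each byte as an unsigned integer (keys start at 1).
$bytes = array_values(unpack('C*', $quote));
echo implode(' ', $bytes), "\n";   // 226 128 156

$first = ord($quote);              // only the first byte: 226
$alone = chr($first);              // the single byte 0xE2

// A lead byte without its two continuation bytes is malformed UTF-8.
var_dump(mb_check_encoding($alone, 'UTF-8')); // bool(false)
```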
…Instead of the ord and chr pair, use mb_ord and its complement mb_chr, e.g. as follows:
Result: .SO74045685.php
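A minimal sketch of that round trip, assuming PHP 7.2 or later with mbstring enabled and a UTF-8 source file. This is an illustration of the two functions, not necessarily the exact script referred to above.

```php
<?php
$quote = "“"; // assumption: file saved as UTF-8

// mb_ord() decodes the whole multibyte sequence into one Unicode code point.
$codePoint = mb_ord($quote, 'UTF-8');
echo $codePoint, ' = U+', strtoupper(dechex($codePoint)), "\n"; // 8220 = U+201C

// mb_chr() is the inverse: code point back to a UTF-8 string.
$roundTrip = mb_chr($codePoint, 'UTF-8');
var_dump($roundTrip === $quote); // bool(true)
```

Passing the encoding explicitly avoids depending on whatever mb_internal_encoding() happens to be set to.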
Edit: you can get the Windows-1251 code (147) for the character “ (U+201C, Left Double Quotation Mark) as follows:
There is no ASCII representation for “; as has already been said, it is multibyte, UTF-8 to be precise:
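Both claims can be seen in one hedged sketch, assuming mbstring is available and the file is saved as UTF-8. In UTF-8 the character takes three bytes, while converting it to the legacy Windows-1251 code page yields the single byte 147 that the question expected. Going the other way turns 147 back into “.

```php
<?php
$quote = "“"; // assumption: file saved as UTF-8

echo strlen($quote), "\n";   // 3 (bytes in UTF-8)

// Convert UTF-8 to the single-byte Windows-1251 code page.
$legacy = mb_convert_encoding($quote, 'Windows-1251', 'UTF-8');
echo strlen($legacy), "\n";  // 1
echo ord($legacy), "\n";     // 147

// And back: interpret byte 147 as Windows-1251, re-encode as UTF-8.
$back = mb_convert_encoding(chr(147), 'UTF-8', 'Windows-1251');
var_dump($back === $quote);  // bool(true)
```

If the incoming text really is in a single-byte code page, converting it to UTF-8 once at the boundary is usually simpler than juggling ord() and chr() values.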
ord() and chr() don’t support this; you’re only looking at the first byte of up to four needed for a particular character. Fortunately there are functions that do:
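As a rough illustration, the sketch below compares ord() with mb_ord() for characters that need one, two, three and four bytes in UTF-8. The helper byteProfile() is a made-up name for this example, and the code assumes PHP 7.2 or later with mbstring. The "\u{...}" escapes are used so the result does not depend on how the file is saved.

```php
<?php
// Hypothetical helper: reports byte length, first byte, and full code point.
function byteProfile(string $char): array
{
    return [
        'bytes'      => strlen($char),           // length in bytes, not characters
        'first_byte' => ord($char),              // what ord() sees
        'code_point' => mb_ord($char, 'UTF-8'),  // what the character actually is
    ];
}

// "A", "é", "“" and "😏" written as Unicode escapes.
$samples = ["\u{41}", "\u{E9}", "\u{201C}", "\u{1F60F}"];

foreach ($samples as $char) {
    $p = byteProfile($char);
    printf("%s  bytes=%d  ord=%d  mb_ord=%d\n",
        $char, $p['bytes'], $p['first_byte'], $p['code_point']);
}
```

Only for plain ASCII do ord() and mb_ord() agree; for everything else ord() returns just the lead byte.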
But why do you need to transform it back and forth? It seems you already have the character in your code :), not as a value but as the actual visual representation.