How do I display extended ASCII characters in my PHP code? - PhpOut

expresso500
October 12, 2022
68 views
2 votes
3 Answers

I’m trying to decode a text that contains extended ASCII characters but when I try to convert the character I get the wrong value. Like this:

    echo "“<br>";
    echo ord("“")."<br>";
    echo chr(ord("“"))."<br>";

And this is my output:

“
226
�

The ASCII value of the character "“" is 147, not 226. And instead of the � symbol, I want to get "“" character back.

I’m using UTF-8

<meta charset="utf-8">

I have tried changing to different charsets but it didn’t work.

Tags: ascii character decode php utf-8

Answers

- KIKOSoftware
- October 12, 2022 at 7:34 pm
- 0 votes
You’re incorrect about the “ character. Code point 147 (U+0093) is not “ but the control character SET TRANSMIT STATE, and its UTF-8 encoding is two bytes: c293.

In the manual for ord() it says:

However, note that this function is not aware of any string encoding,
and in particular will never identify a Unicode code point in a
multi-byte encoding such as UTF-8 or UTF-16.

On top of this, if I actually convert the '“' character to hexadecimal, I get: e2809c. So it’s a triplet. Never trust what you read online. 😏

See: https://3v4l.org/57UV8
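
To check the two byte sequences yourself, here is a minimal sketch. It assumes PHP 7.0 or later for the `\u{...}` escape, and the variable names are only illustrative:

```php
<?php
// U+201C LEFT DOUBLE QUOTATION MARK, written as an escape so the
// encoding of the source file itself does not matter.
$quote = "\u{201C}";

// U+0093 SET TRANSMIT STATE, the character that code point 147 refers to.
$sts = "\u{93}";

echo bin2hex($quote), "\n"; // e2809c (three bytes)
echo bin2hex($sts), "\n";   // c293   (two bytes)
echo strlen($quote), "\n";  // 3, because strlen() counts bytes, not characters
```

Neither sequence is the single byte 147, which is why `chr(147)` on its own cannot produce a valid UTF-8 character.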


- JosefZ
- October 12, 2022 at 8:07 pm
- 0 votes
1st: U+201C (Left Double Quotation Mark) is encoded in UTF-8 as the byte sequence E2 80 9C (hexadecimal), i.e. decimal 226 128 156.

2nd: ord() converts only the first byte of a string to a value between 0 and 255.

Result: ord("“") returns 226…
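
A small sketch of where the 226 comes from, listing every byte of the string rather than only the first one (assuming PHP 7.0+ for the `\u{...}` escape; `$bytes` is just an illustrative name):

```php
<?php
$quote = "\u{201C}"; // “ as UTF-8

// unpack('C*') returns each byte as an unsigned integer (1-indexed array),
// so array_values() is used to re-index it from 0.
$bytes = array_values(unpack('C*', $quote));

echo implode(' ', $bytes), "\n"; // 226 128 156
echo ord($quote), "\n";          // 226, i.e. only $bytes[0]
```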

Instead of ord and chr pair, use mb_ord and its complement mb_chr, e.g. as follows:
```
<?php
echo "“<br>";
echo mb_ord("“")."<br>";
echo mb_chr(mb_ord("“"))."<br>";
?>
```
Result:

“
8220
“

Edit: you can get the Windows-1251 code (147) for the character “ (U+201C, Left Double Quotation Mark) as follows:
```
echo ord(mb_convert_encoding("“","Windows-1251","UTF-8"));  //147
```
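
Going the other way is probably closer to what the question is after: decoding legacy "extended ASCII" bytes into UTF-8. A hedged sketch, assuming the source text is Windows-1252 (the question does not say which code page it is; byte 147 maps to “ in both Windows-1251 and Windows-1252) and that the mbstring extension is loaded:

```php
<?php
// Assumption: the legacy data is Windows-1252, where byte 0x93 (147) is “.
$legacy = chr(147);

// Convert the single legacy byte to its UTF-8 representation.
$utf8 = mb_convert_encoding($legacy, 'UTF-8', 'Windows-1252');

echo bin2hex($utf8), "\n";         // e2809c
echo mb_ord($utf8, 'UTF-8'), "\n"; // 8220

// Converting back yields the original byte again.
$back = mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');
echo ord($back), "\n";             // 147
```

If the input really is in another code page, only the encoding name passed to mb_convert_encoding() needs to change.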

- TorbjörnStabo
- October 12, 2022 at 8:10 pm
- 0 votes
There is no ASCII representation for “. As has already been said, it is a multibyte character, UTF-8 to be precise:
```
echo mb_detect_encoding("“"); // UTF-8
```
ord() and chr() don’t support this; you’re only looking at the first byte of up to four needed for a particular character. Fortunately, there are functions that do:
```
echo "“\n"; // “
echo mb_ord("“")."\n"; // 8220
echo mb_chr(mb_ord("“")); // “
```
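
To illustrate the "up to four bytes" point, a short sketch comparing byte counts with what ord() sees. The four sample characters are picked only for illustration, and PHP 7.2+ with mbstring is assumed:

```php
<?php
// Sample characters needing 1, 2, 3 and 4 bytes in UTF-8:
// A (U+0041), é (U+00E9), “ (U+201C), 😏 (U+1F60F).
$samples = ["A", "\u{E9}", "\u{201C}", "\u{1F60F}"];

$byteCounts = [];
foreach ($samples as $char) {
    $byteCounts[] = strlen($char); // strlen() counts bytes
    printf(
        "U+%04X: %d byte(s), ord() sees %d, mb_strlen() says %d\n",
        mb_ord($char, 'UTF-8'),
        strlen($char),
        ord($char),
        mb_strlen($char, 'UTF-8')
    );
}
```

Only for the plain ASCII "A" do ord() and mb_ord() agree; for the others ord() returns just the lead byte (195, 226 and 240 respectively).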
But why do you need to transform it back and forth? It seems you already have the character in your code :), not as a value but as the actual visual representation.