Consider the following text:
sample_text = "The fox's color was \\u201Cbrown\\u201D and it’s speed was quick"
Notice that there is a regular single quote (') in "fox's" and a right single quotation mark (’, U+2019) in "it’s".
My goal is to turn the escaped sequences in sample_text back into the characters they represent, but I have not been able to do so completely.
I did the following:
>>> sample_text.encode().decode('unicode-escape')
"The fox's color was “brown” and itâ\x80\x99s speed was quick"
Now my question: is there any way to keep the original right single quote when decoding sample_text? As you can see from my code's output, it gives me itâ\x80\x99s instead.
I want it to be: it’s
Edit: As suggested in the comments, I’m adding the output of print(sample_text)
print(sample_text)
output: The fox's color was \u201Cbrown\u201D and it’s speed was quick
Edit: I’m using Python 3.8.10 on Ubuntu.
2 Answers
According to your post and your edits this should work for you:
To avoid confusion, I should add that this does not work for me; it gives me this:
(I’m using python 3.10.6 in Ubuntu 22.04.2 in WSL2 right now)
But since the color was output correctly in your code sample, it should work for you.
Read about unicode-escape in Python Specific Encodings (emphasis mine):
Hence, .encode().decode('unicode_escape') causes a mojibake case as follows:
Solution in the following code:
Linux:
Windows:
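Whatever the platform, the underlying cause is the same: .encode() produces UTF-8 bytes, so ’ (U+2019) becomes the three bytes E2 80 99, and the unicode_escape codec then decodes every non-escape byte as Latin-1. Here is a minimal sketch of that mechanism and of two possible ways around it. It assumes sample_text really contains literal backslash-u sequences (hence the doubled backslashes in the literal), and the names decode_u_escapes and via_codec are only illustrative.

```python
import re

# Assumption: the string holds literal backslash-u sequences (doubled
# backslashes in this literal) next to a real U+2019 quotation mark.
sample_text = "The fox's color was \\u201Cbrown\\u201D and it\u2019s speed was quick"

# The mojibake: encode() yields UTF-8 bytes, so U+2019 becomes
# b'\xe2\x80\x99', and unicode_escape reads each of those bytes as Latin-1.
mojibake = sample_text.encode().decode("unicode_escape")

_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_u_escapes(text: str) -> str:
    """Replace each literal \\uXXXX sequence; leave every other character alone."""
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Alternative: first turn the non-ASCII characters into escapes as well, so
# the bytes handed to unicode_escape are pure ASCII and nothing is misread.
# Caveat: this also interprets any other backslash sequence in the text.
via_codec = sample_text.encode("ascii", "backslashreplace").decode("unicode_escape")

print(decode_u_escapes(sample_text))
print(via_codec)
```

The regex version only touches four-digit \uXXXX sequences, so it does not combine surrogate pairs; the codec version handles every escape form that unicode_escape understands, which may or may not be what you want if the text can contain other backslashes.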