
I’m getting some Facebook posts that contain a mixture of English and a non-English language (Khmer, to be exact).

Here’s how the non-English text is displayed when I print the data to the screen or save it to a file: \u178a\u17c2\u179b\u1787\u17b6\u17a2\u17d2. I would rather have it displayed as ឈឹម បញ្ចពណ៌. (Note: this is not a translation of the preceding Unicode escapes.)

3 Answers


  1. Chosen as BEST ANSWER

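    In PyCharm (Python 2) I added:

    1. At the top of the file: # -*- coding: utf-8 -*-

    2. import sys
       reload(sys)
       sys.setdefaultencoding('utf8')
       (Python 2 only: the reload builtin and sys.setdefaultencoding do not exist in Python 3.)

    3. s = json.dumps(posts['data'], ensure_ascii=False)
    4. json_file.write(s.decode('utf-8'))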

  2. This should be it:

    print(u'\u1787\u17b6\u17a2\u17d2')  # Python 3
    print u'\u1787\u17b6\u17a2\u17d2'   # Python 2.7
    

    Output: ជាអ្

  3. Try this if you want to save the info in a file:

    import codecs
    
    # The u prefix keeps this a Unicode string on Python 2 as well as Python 3.
    string = u'ឈឹម បញ្ចពណ៌'
    with codecs.open('yourfile', 'w', encoding='utf-8') as f:
        f.write(string)
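
For anyone on Python 3, the ideas in the answers above can probably be combined without the sys.setdefaultencoding workaround. This is only a minimal sketch under some assumptions: the posts dictionary is a made-up stand-in for the Facebook response, and save_posts and the posts.json file name are hypothetical names chosen for illustration.

```python
import json
import os
import tempfile

# Made-up stand-in for the API response; the real posts['data'] will differ.
posts = {"data": [{"message": "Hello ឈឹម បញ្ចពណ៌"}]}


def save_posts(data, path):
    """Serialize data as JSON with readable non-ASCII text and write it as UTF-8."""
    # ensure_ascii=False keeps Khmer characters instead of \uXXXX escapes.
    text = json.dumps(data, ensure_ascii=False)
    # The explicit encoding avoids depending on the platform default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return text


# With the default ensure_ascii=True, every non-ASCII character is escaped.
escaped = json.dumps(posts["data"])

# Hypothetical output location in a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "posts.json")
readable = save_posts(posts["data"], out_path)

# Reading the file back with the same encoding recovers the original data.
with open(out_path, encoding="utf-8") as f:
    round_trip = json.load(f)
```

Both forms decode to the same data, so the escaped output is not corrupted, only harder to read. In Python 3 the built-in open with encoding="utf-8" covers the same ground as codecs.open in the last answer.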
    