
I noticed that when using with open to read a JSON file, the r and rb modes return identical results.

with open('something.json', 'rb') as f:  # 'r' returns the same thing
    t1 = json.load(f)

However, when I write to a JSON file with wb, I get an error:

with open('something.json', 'wb') as f:
  json.dump(some_dict, f)

TypeError: a bytes-like object is required, not 'str'

But w works fine. Why is this the case?

4 Answers


  1. This happens because json.dump writes str, while a file opened in binary mode only accepts bytes. Try using json.dumps(some_dict) to serialize your dictionary to a JSON string, and then call .encode('utf-8') on that string (not on the dictionary) to write it as binary.
    Something like this:

    import json

    json_data = json.dumps(some_dict)  # serialize to a str first
    with open('something.json', 'wb') as f:
        f.write(json_data.encode('utf-8'))  # encode to bytes for binary mode
    
  2. As of Python 3.6, the json module tries to auto-detect the encoding of a binary file when reading JSON. UTF-8, UTF-16, or UTF-32 are supported.

    However, this doesn’t work when writing. There’s no way to auto-detect what encoding you wanted to use when writing. json could make an assumption, like open does when you open a file in text mode without specifying an encoding, but the tradeoffs are different.

    If json were to assume UTF-8 encoding (or any other encoding) when writing to a binary file, then you might end up reading UTF-32 and writing UTF-8 and not noticing the encoding change until some other code that needs UTF-32 breaks. That’s less of an issue with open, because open makes the same assumption when reading or writing, instead of trying to auto-detect encoding when reading.

    Since auto-detection is impossible and assuming an encoding would be error-prone, json requires you to give it a text file when writing.

  3. When the file is opened in read-only binary mode (rb), f.read() returns bytes instead of str (as it would in text mode). json.load accepts either.

    When the file is opened in write-only binary mode (wb), f.write() accepts only bytes-like objects, not str. So when json.dump calls f.write() with a string, it raises the exception you saw above.

  4. Since json.load and json.loads are given data to inspect, they try to guess the encoding when that data is binary. json.dump can't do that, because when dumping there isn't anything to inspect. So there is a lack of symmetry: you can give a binary file to the reader, at the risk that it guesses the encoding incorrectly, but not to the writer. The writer only works with strings, and you are expected to deal with encoding on your own.

    Personally, I wouldn’t trust the encoding guesser and would always open in "r" mode with an explicit encoding.
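
The behaviour described in the answers above can be checked with a short sketch. It assumes Python 3.6 or later; the sample dictionary and the temporary file path are made up for illustration and are not from the question.

```python
import json
import os
import tempfile

# Hypothetical sample data, used only for this illustration.
data = {"name": "example", "count": 3}

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical path inside a throwaway directory.
    path = os.path.join(tmp, "something.json")

    # Text mode with an explicit encoding: json.dump writes str,
    # and the text file encodes it to bytes for us.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)

    # Binary read: f.read() returns bytes, and json.load detects
    # the encoding (UTF-8, UTF-16 or UTF-32) before parsing.
    with open(path, "rb") as f:
        from_binary = json.load(f)

    # Text read with an explicit encoding: no guessing involved.
    with open(path, "r", encoding="utf-8") as f:
        from_text = json.load(f)

    # Binary write: json.dump passes str to f.write(), which only
    # accepts bytes-like objects, so a TypeError is raised.
    # This runs last because opening with "wb" truncates the file.
    binary_write_error = None
    try:
        with open(path, "wb") as f:
            json.dump(data, f)
    except TypeError as exc:
        binary_write_error = exc

print(from_binary == from_text == data)
print(type(binary_write_error).__name__)
```

Both reads give back the same dictionary, while the binary write fails. Opening in text mode with an explicit encoding for both reading and writing keeps the two directions symmetric.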
