In Python I need to keep dumping an array of user data, consisting of JSON objects (simple format: [{}, {}…]), to a storage file. As the program captures and records user data, the array [] will grow, potentially into the small thousands of objects, and each object {} holds up to about fifty key:value pairs.
How would I write a custom function for json.dump() using the native "default=" kwarg that starts each object on a new line and adds a blank line between objects? Using the "indent=" kwarg to pretty-print the array makes a file that is too long vertically, and without it I get a dense mass of objects that is hard to read.
To illustrate the problem, take a much shorter array.
example = [{"a": 1, "b": 2, "c": 3}, {"d": 1, "e": 2, "f": 3}, {"g": 1, "h": 2, "i": 3}, {"j": 6, "k": 7, "l": 8}]
Without using indent, the output to my .json file is too compact.
[{"a": 1, "b": 2, "c": 3}, {"d": 1, "e": 2, "f": 3}, {"g": 1, "h": 2, "i": 3}, {"j": 6, "k": 7, "l": 8}]
If I use indent, I get a narrow vertical column.
[
    {
        "a": 1,
        "b": 2,
        "c": 3
    },
    {
        "d": 1,
        "e": 2,
        "f": 3
    },
    {
        "g": 1,
        "h": 2,
        "i": 3
    },
    {
        "j": 6,
        "k": 7,
        "l": 8
    }
]
To minimize whitespace while maintaining readability, I want output in this format (here arranged manually).
[
  {"a": 1, "b": 2, "c": 3},
  {"d": 1, "e": 2, "f": 3},
  {"g": 1, "h": 2, "i": 3},
  {"j": 6, "k": 7, "l": 8}
]
When each object holds up to fifty key:value pairs, this format, even with line wrapping, achieves both ends. I’m surprised the method provides no such option.
But I can’t find enough info on the nuts and bolts of JSON code to impose newlines and extra linespaces within the existing json.dump() method. Any help?
3 Answers
You can use the json module with dumps(): serialize each object individually and write one object per line. You can also join the pieces into a single string and write() it only once.

If your top-level object is an array, consider the JSON Lines (JSONL) format instead. It is literally one JSON object per line with no [] on the outside. The advantage is that you can append to a JSONL file without rewriting the whole file, which is great for efficiency: calling dump on an ever-expanding array incurs a linear slowdown per dump (quadratic time overall), whereas appending to a JSON Lines file is constant time per record (linear time overall). To append new objects, open the file in append mode and write one dumps() line per object.
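A sketch of both ideas, assuming your data is a list of dicts; the function names and path arguments are my own, not part of the standard library:

```python
import json

def write_pretty(records, path):
    """Write a list of dicts as a JSON array, one object per line."""
    with open(path, "w", encoding="utf-8") as f:
        # dumps() each object compactly, then join with ",\n"
        body = ",\n".join("  " + json.dumps(obj) for obj in records)
        f.write("[\n" + body + "\n]")  # write() it only once

def append_jsonl(records, path):
    """Append records to a JSON Lines file: one dumps() line per object."""
    with open(path, "a", encoding="utf-8") as f:
        for obj in records:
            f.write(json.dumps(obj) + "\n")
```

write_pretty produces exactly the bracketed one-object-per-line layout asked for above, but must rewrite the whole file each time; append_jsonl only ever appends.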
To read all records from a JSON Lines file:
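A minimal reader, assuming each line of the file holds one JSON object (the function name is my own):

```python
import json

def read_jsonl(path):
    """Read a JSON Lines file back into a list of Python objects."""
    with open(path, encoding="utf-8") as f:
        # one json.loads() per non-blank line
        return [json.loads(line) for line in f if line.strip()]
```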
You can’t. default is only called for values the encoder cannot serialize natively, not for "rows", and in any case your values are all serializable, so it would never fire.
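To illustrate the point above, a quick sketch (log_default is a made-up name): default fires only when the encoder hits a value it cannot serialize, so it never sees the dicts themselves and cannot control line breaks:

```python
import json

def log_default(obj):
    # called only for values json cannot serialize natively
    print("default called for:", obj)
    return str(obj)

# dicts, strings, and ints are all natively serializable,
# so default is never invoked here
json.dumps([{"a": 1}, {"b": 2}], default=log_default)

# default *is* invoked for a non-serializable value, e.g. a set
json.dumps({"s": {1, 2}}, default=log_default)
```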