I have the following dataframe:
2023-06-30 2022-06-30 2021-06-30 2020-06-30
Ordinary Shares Number 7432000000.0 7464000000.0 7519000000.0 7571000000.0
Share Issued 7432000000.0 7464000000.0 7519000000.0 7571000000.0
Net Debt 12533000000.0 35850000000.0 43922000000.0 49751000000.0
Total Debt 59965000000.0 61270000000.0 67775000000.0 70998000000.0
Tangible Book Value 128971000000.0 87720000000.0 84477000000.0 67915000000.0
... ... ... ... ...
Cash Cash Equivalents And Short Term Investments 111262000000.0 104757000000.0 130334000000.0 136527000000.0
Other Short Term Investments 76558000000.0 90826000000.0 116110000000.0 122951000000.0
Cash And Cash Equivalents 34704000000.0 13931000000.0 14224000000.0 13576000000.0
Cash Equivalents 26226000000.0 5673000000.0 6952000000.0 NaN
Cash Financial 8478000000.0 8258000000.0 7272000000.0 NaN
[73 rows x 4 columns]
And I am trying to convert it to json in the following format:
{
"2023-06-30": {
"Ordinary Shares Number": "7432000000.0",
...
},
"2022-06-30": {
"Ordinary Shares Number": "7464000000.0",
...
},
"2021-06-30": {
"Ordinary Shares Number": "7519000000.0",
...
},
"2020-06-30": {
"Ordinary Shares Number": "7571000000.0",
...
}
}
However my attempts to convert it have ranged from bad to worse so I really have no idea what I’m doing.
My attempts are either giving my a very undesirable json format or a type error about the timestamps:
For example:
out = json.dumps({c: dict(zip(balance.index, balance[c])) for c in balance.columns}, indent=4)
print(out)
Results in:
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/jesse_b/tools/stonk-db/stonkdb/__main__.py", line 39, in <module>
main()
File "/home/jesse_b/tools/stonk-db/stonkdb/__main__.py", line 32, in main
out = json.dumps({c: dict(zip(balance.index, balance[c])) for c in balance.columns}, indent=4)
File "/usr/lib/python3.8/json/__init__.py", line 234, in dumps
return cls(
File "/usr/lib/python3.8/json/encoder.py", line 201, in encode
chunks = list(chunks)
File "/usr/lib/python3.8/json/encoder.py", line 431, in _iterencode
yield from _iterencode_dict(o, _current_indent_level)
File "/usr/lib/python3.8/json/encoder.py", line 376, in _iterencode_dict
raise TypeError(f'keys must be str, int, float, bool or None, '
TypeError: keys must be str, int, float, bool or None, not Timestamp
2
Answers
You can use dict-comprehension:
Prints:
We can achieve the same JSON structure by setting the first column as the index of the DataFrame and then converting it to JSON using
df.to_json(orient='columns')
. In the'columns'
orientation, the JSON takes the shape where each column in the DataFrame becomes a key, and the corresponding values are stored in an array under that key.Prints