I am relatively new to JSON and after going through several tutorials I have decided to reach out here for additional advice. I am trying to convert the following JSON dictionary via Jupyternotebook/Pandas into a dataframe and later into a csv file (data is extracted from a public API).
Example JSON dictionary without indentation:
json_search_result =
{'inactive_count': 0, 'links': {'self': '/company/08892216/officers'}, 'etag': '1ccc4c74454bc359a9b9ef686d11e70cc03b6cb8', 'kind': 'officer-list', 'resigned_count': 0, 'total_results': 1, 'active_count': 1, 'items_per_page': 35, 'items': [{'date_of_birth': {'year': 1964, 'month': 2}, 'links': {'self': '/company/08892216/appointments/Ao1-pIsFQHy0FVSv26y6ep4Qhyw', 'officer': {'appointments': '/officers/JpJgO2pKVmGq5VHD7j7lluLW-jo/appointments'}}, 'officer_role': 'director', 'address': {'premises': '19', 'locality': 'London', 'country': 'United Kingdom', 'address_line_1': 'Norcott Road', 'postal_code': 'N16 7EJ'}, 'appointed_on': '2014-02-12', 'nationality': 'Irish', 'country_of_residence': 'England', 'name': 'FLOOD, Eileen', 'occupation': 'Company Director'}], 'start_index': 0}
With indentation for better visibility:
json_search_result =
{'inactive_count': 0, 'links': {'self': '/company/08892216/officers'},
'etag': '1ccc4c74454bc359a9b9ef686d11e70cc03b6cb8', 'kind': 'officer-list',
'resigned_count': 0, 'total_results': 1, 'active_count': 1, 'items_per_page': 35, 'items':
[{'date_of_birth': {'year': 1964, 'month': 2}, 'links': {'self': '/company/08892216/appointments/Ao1-pIsFQHy0FVSv26y6ep4Qhyw', 'officer': {'appointments':
'/officers/JpJgO2pKVmGq5VHD7j7lluLW-jo/appointments'}}, 'officer_role': 'director', 'address': {'premises': '19', 'locality': 'London', 'country': 'United Kingdom',
'address_line_1': 'Norcott Road', 'postal_code': 'N16 7EJ'},
'appointed_on': '2014-02-12', 'nationality': 'Irish', 'country_of_residence': 'England',
'name': 'FLOOD, Eileen', 'occupation': 'Company Director'}], 'start_index': 0}
I have tried among others json_normalize but I either get errors given this is a dictionary:
df2 = pd.json_normalize(json_search_result)
Thank you for any advice on this issues!
2
Answers
Please try this if this helps your problem.
Good reference for further improvements: Convert list of dictionaries to a pandas DataFrame
I assume that you will also want to flatten your dataframe since it contains additional nested information. So start by transforming you json string to a dataframe:
which gives:
Now, this function will flatten ANY dataframe with nested json strings:
Applying it:
will result in:
As you can see, the nested fields have been made to columns: