I have a JSON that I download from a website that has multiple nested dictionaries inside the main list. This is a very simplified version of it.
[
{
"id": 1,
"attributes": {
"autoNumber": 1,
"make": "Ford",
"model": "F150",
"trim": "Lariat"
},
"engine": {
"data": [
{
"id": 1,
"attributes": {
"engine": "5.0l v8 ",
"horsePower": "400",
"torque": "410"
}
},
{
"id": 2,
"attributes": {
"engine": "2.7l v6 ",
"horsePower": "325",
"torque": "300"
}
}
]
}
}
]
This is the code I use to convert to xlsx
import json
import pandas as pd
# Load JSON data
with open('data.json') as json_file:
data = json.load(json_file)
# Normalize JSON data to tabular format
df = pd.json_normalize(data)
df.to_excel('data.xlsx', index=False)
I want all of the various attributes to be in its own column. The engine data may be zero to two entries and should go into one each. The issue I am having is that it is putting all of the engine data into one column.
column f = [{'id': 1, 'attributes': {'engine': '5.0l v8 ', 'horsePower': '400', 'torque': '410'}}, {'id': 2, 'attributes': {'engine': '2.7l v6 ', 'horsePower': '325', 'torque': '300'}}]
I tried dataframe explode, but it created additional rows instead of columns.
2
Answers
I suggest to create two dataframes from the data and then merge it together based on the
id
, e.g.:Prints:
You can use pd.json_normalize()
Code:
Output: