Converting JSON list with multiple nested dictionaries to csv or excel

TxHemi
July 20, 2024
62 views
0 votes
2 Answers

I have a JSON that I download from a website that has multiple nested dictionaries inside the main list. This is a very simplified version of it.

[
    {
        "id": 1,
        "attributes": {
            "autoNumber": 1,
            "make": "Ford",
            "model": "F150",
            "trim": "Lariat"
            },
            "engine": {
                "data": [
                    {
                        "id": 1,
                        "attributes": {
                            "engine": "5.0l v8 ",
                            "horsePower": "400",
                            "torque": "410"
                        }
                    },
                    {
                        "id": 2,
                        "attributes": {
                            "engine": "2.7l v6 ",
                            "horsePower": "325",
                            "torque": "300"
                        }
                    }
                ]
            }
    }
]

This is the code I use to convert to xlsx

import json
import pandas as pd

# Load JSON data
with open('data.json') as json_file:
    data = json.load(json_file)

# Normalize JSON data to tabular format
df = pd.json_normalize(data)

df.to_excel('data.xlsx', index=False)

I want all of the various attributes to be in its own column. The engine data may be zero to two entries and should go into one each. The issue I am having is that it is putting all of the engine data into one column.

column f = [{'id': 1, 'attributes': {'engine': '5.0l v8 ', 'horsePower': '400', 'torque': '410'}}, {'id': 2, 'attributes': {'engine': '2.7l v6 ', 'horsePower': '325', 'torque': '300'}}]

I tried dataframe explode, but it created additional rows instead of columns.

Answers

I suggest to create two dataframes from the data and then merge it together based on the id, e.g.:

import pandas as pd

data = [
    {
        "id": 1,
        "attributes": {
            "autoNumber": 1,
            "make": "Ford",
            "model": "F150",
            "trim": "Lariat",
        },
        "engine": {
            "data": [
                {
                    "id": 1,
                    "attributes": {
                        "engine": "5.0l v8 ",
                        "horsePower": "400",
                        "torque": "410",
                    },
                },
                {
                    "id": 2,
                    "attributes": {
                        "engine": "2.7l v6 ",
                        "horsePower": "325",
                        "torque": "300",
                    },
                },
            ]
        },
    }
]

df1 = []
for d in data:
    df1.append({"id": d["id"], **d["attributes"]})

df2 = []
for d in data:
    for engine_data in d["engine"]["data"]:
        df2.append(
            {
                "id": d["id"],
                "attribute_id": engine_data["id"],
                **engine_data["attributes"],
            }
        )

df1 = pd.DataFrame(df1)
df2 = pd.DataFrame(df2)

df_out = df1.merge(df2, on="id", how="outer")
print(df_out)

Prints:

   id  autoNumber  make model    trim  attribute_id    engine horsePower torque
0   1           1  Ford  F150  Lariat             1  5.0l v8         400    410
1   1           1  Ford  F150  Lariat             2  2.7l v6         325    300

You can use pd.json_normalize()

Code:

import json

import pandas as pd


with open("data.json") as json_file:
    data = json.load(fp=json_file)

df = pd.json_normalize(data=data,
                       record_path=["engine", "data"],
                       meta=["attributes", "id"],
                       meta_prefix="meta_"
                       )
df = (pd
      .concat(objs=[df, df.pop(item="meta_attributes").apply(func=pd.Series)], axis=1)
      .rename(columns=lambda x: x.split(".")[-1])
      )
print(df)

Output:

  id    engine horsePower torque meta_id  autoNumber  make model    trim
0   1  5.0l v8         400    410       1           1  Ford  F150  Lariat
1   2  2.7l v6         325    300       1           1  Ford  F150  Lariat

Please signup or login to give your own answer.

Click here to cancel reply.