
Convert a JSON file into a pandas DataFrame

MarioPellegrini
October 3, 2023
85 views
3 votes
4 Answers

I would like to turn a json file into a dataframe, but I get the error:
ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
Can anyone help me?

The json file is as follows:

{
    "testCaseResult": {
        "timestamp": 1696257561,
        "testCaseStatus": "Failed",
        "result": "Found nullCount=2. It should be 0",
        "testResultValue": [
            {
                "name": "nullCount",
                "value": "2"
            }
        ],
        "testCaseFailureStatus": {
            "testCaseFailureStatusType": "New",
            "updatedAt": 1696257581779
        }
    }
}

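The Python code I used is as follows:

import json
import pandas as pd

# fjson holds the path to the JSON file shown above
with open(fjson) as f:
    data = json.load(f)
for i in data['testCaseResult']:
    print(i)
# This is the line that raises the ValueError
testCaseResult_df = pd.DataFrame(data['testCaseResult'])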

Answers

Your JSON contains a nested structure. You have to flatten the JSON object before converting it to a DataFrame.

import pandas as pd
import json

fjson = 'C:/dati/json/json_file.json'
with open(fjson) as f:
    data = json.load(f)
    print(data)  # {'testCaseResult': {'timestamp': 1696257561, 'testCaseStatus': 'Failed', 'result': 'Found nullCount=2. It should be 0', 'testResultValue': [{'name': 'nullCount', 'value': '2'}], 'testCaseFailureStatus': {'testCaseFailureStatusType': 'New', 'updatedAt': 1696257581779}}}

# Flatten one level of nesting: dicts become key_subkey, lists of dicts become key_index_subkey
flat_data = {}
for key, value in data['testCaseResult'].items():
    if isinstance(value, dict):
        for nested_key, nested_value in value.items():
            flat_data[f'{key}_{nested_key}'] = nested_value
    elif isinstance(value, list):
        for i, dict_ in enumerate(value):
            for nested_key, nested_value in dict_.items():
                flat_data[f'{key}_{i}_{nested_key}'] = nested_value
    else:
        flat_data[key] = value

# Wrap the flat dict in a list so it becomes a single row
testcaseresult_df = pd.DataFrame([flat_data])

You can use json_normalize:

df = pd.json_normalize(data['testCaseResult'], 'testResultValue', 
                       ['timestamp','testCaseStatus','result',
                       ['testCaseFailureStatus','testCaseFailureStatusType'],
                       ['testCaseFailureStatus','updatedAt']])
print(df)
        name value   timestamp testCaseStatus  
0  nullCount     2  1696257561         Failed   

                              result  
0  Found nullCount=2. It should be 0   

  testCaseFailureStatus.testCaseFailureStatusType  
0                                             New   

  testCaseFailureStatus.updatedAt  
0                   1696257581779
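
If you do not need one row per testResultValue entry, a simpler variant is to call json_normalize without a record path. This is only a sketch, assuming the same data as in the question; sep="_" is just my choice for the column names, and the list stays in a single cell.

```python
import pandas as pd

data = {
    "testCaseResult": {
        "timestamp": 1696257561,
        "testCaseStatus": "Failed",
        "result": "Found nullCount=2. It should be 0",
        "testResultValue": [{"name": "nullCount", "value": "2"}],
        "testCaseFailureStatus": {
            "testCaseFailureStatusType": "New",
            "updatedAt": 1696257581779,
        },
    }
}

# Nested dicts are expanded into columns joined with sep;
# the testResultValue list is kept as-is in one cell.
df = pd.json_normalize(data["testCaseResult"], sep="_")
print(df.columns.tolist())
```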

- gwanhun
- October 3, 2023 at 1:19 pm
- 0 votes
Think about what a DataFrame is.
A pandas DataFrame is for tabular data consisting of rows and columns.

The JSON you attached is nested, so it does not fit that shape without preprocessing.
```
ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
```
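This error comes from the array in your JSON: pandas raises it when the values passed to DataFrame mix a dict (testCaseFailureStatus) with a list (testResultValue).
In my opinion, it is reasonable for the error to occur unless you modify the data.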
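
To see where the error comes from, here is a small reproduction. It is a sketch with a cut-down copy of the question's data (the variable names are mine): scalars plus a list build a one-row frame, but adding a dict value next to the list triggers the ValueError.

```python
import pandas as pd

# Scalars and a one-element list: pandas can build a single row from this
scalar_and_list = {
    "timestamp": 1696257561,
    "testResultValue": [{"name": "nullCount", "value": "2"}],
}
ok_df = pd.DataFrame(scalar_and_list)

# Same data plus a dict value, as in the original JSON
mixed = dict(scalar_and_list, testCaseFailureStatus={"testCaseFailureStatusType": "New"})

try:
    pd.DataFrame(mixed)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```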

The error you’re encountering typically occurs when you try to convert a nested dictionary structure into a Pandas DataFrame directly. To convert this JSON data into a DataFrame, you need to flatten the nested structure. You can use the code:

import pandas as pd
import json

# Your JSON data
data = {
    "testCaseResult": {
        "timestamp": 1696257561,
        "testCaseStatus": "Failed",
        "result": "Found nullCount=2. It should be 0",
        "testResultValue": [
            {
                "name": "nullCount",
                "value": "2"
            }
        ],
        "testCaseFailureStatus": {
            "testCaseFailureStatusType": "New",
            "updatedAt": 1696257581779
        }
    }
}

# Recursively flatten nested dicts into a single-level dict (lists are left as-is)
def flatten_json(json_obj, parent_key='', separator='_'):
    items = {}
    for key, value in json_obj.items():
        new_key = parent_key + separator + key if parent_key else key
        if isinstance(value, dict):
            items.update(flatten_json(value, new_key, separator=separator))
        else:
            items[new_key] = value
    return items

flat_data = flatten_json(data)

# Convert to DataFrame
testCaseResult_df = pd.DataFrame([flat_data])

# Print the DataFrame
print(testCaseResult_df)

Here we recursively flatten the nested JSON structure and convert the flattened data into a pandas DataFrame.
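
One thing to be aware of: a flatten function that only recurses into dicts leaves a list such as testResultValue as a single cell. If you also want the list entries as columns, a possible extension is sketched below. The name flatten_with_lists and the key_index_subkey naming are my own assumptions, not something pandas provides.

```python
import pandas as pd

def flatten_with_lists(obj, parent_key="", separator="_"):
    """Flatten nested dicts and lists into a single-level dict."""
    if isinstance(obj, dict):
        children = obj.items()
    elif isinstance(obj, list):
        # List positions become part of the key, e.g. testResultValue_0_name
        children = enumerate(obj)
    else:
        # Scalar: stop recursing and store the value under the accumulated key
        return {parent_key: obj}

    items = {}
    for key, value in children:
        new_key = f"{parent_key}{separator}{key}" if parent_key else str(key)
        items.update(flatten_with_lists(value, new_key, separator))
    # Note: empty dicts and empty lists produce no keys at all
    return items

data = {
    "testCaseResult": {
        "timestamp": 1696257561,
        "testCaseStatus": "Failed",
        "result": "Found nullCount=2. It should be 0",
        "testResultValue": [{"name": "nullCount", "value": "2"}],
        "testCaseFailureStatus": {
            "testCaseFailureStatusType": "New",
            "updatedAt": 1696257581779,
        },
    }
}

flat = flatten_with_lists(data["testCaseResult"])
df = pd.DataFrame([flat])  # one row, one column per flattened key
print(df.columns.tolist())
```

This only makes sense when every record has lists of similar length; otherwise the number of columns varies from row to row.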
