
I would like to load a JSON file into a pandas DataFrame, but I get the error:
ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
Can anyone help me?

The json file is as follows:

{
    "testCaseResult": {
        "timestamp": 1696257561,
        "testCaseStatus": "Failed",
        "result": "Found nullCount=2. It should be 0",
        "testResultValue": [
            {
                "name": "nullCount",
                "value": "2"
            }
        ],
        "testCaseFailureStatus": {
            "testCaseFailureStatusType": "New",
            "updatedAt": 1696257581779
        }
    }
}

The Python code I used is as follows:

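import json
import pandas as pd

# fjson holds the path to the JSON file shown above
with open(fjson) as f:
    data = json.load(f)
for i in data['testCaseResult']:
    print(i)
# This is the line that raises the ValueError
testCaseResult_df = pd.DataFrame(data['testCaseResult'])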

4 Answers


  1. Your JSON contains nested structures. You have to flatten the JSON object before converting it to a DataFrame.

    import pandas as pd
    import json
    
    fjson = 'C:/dati/json/json_file.json'
    with open(fjson) as f:
        data = json.load(f)
    print(data)
    # {'testCaseResult': {'timestamp': 1696257561, 'testCaseStatus': 'Failed', 'result': 'Found nullCount=2. It should be 0', 'testResultValue': [{'name': 'nullCount', 'value': '2'}], 'testCaseFailureStatus': {'testCaseFailureStatusType': 'New', 'updatedAt': 1696257581779}}}
    
    # Flatten one level of nesting: dict values and lists of dicts
    flat_data = {}
    for key, value in data['testCaseResult'].items():
        if isinstance(value, dict):
            for nested_key, nested_value in value.items():
                flat_data[f'{key}_{nested_key}'] = nested_value
        elif isinstance(value, list):
            for i, dict_ in enumerate(value):
                for nested_key, nested_value in dict_.items():
                    flat_data[f'{key}_{i}_{nested_key}'] = nested_value
        else:
            flat_data[key] = value
    
    # Wrap the flat dict in a list so it becomes a single row
    testcaseresult_df = pd.DataFrame([flat_data])
    
    
  2. You can use pd.json_normalize, with testResultValue as the record path and the remaining fields as metadata:

    df = pd.json_normalize(data['testCaseResult'], 'testResultValue', 
                           ['timestamp','testCaseStatus','result',
                           ['testCaseFailureStatus','testCaseFailureStatusType'],
                           ['testCaseFailureStatus','updatedAt']])
    print(df)
            name value   timestamp testCaseStatus  
    0  nullCount     2  1696257561         Failed   
    
                                  result  
    0  Found nullCount=2. It should be 0   
    
      testCaseFailureStatus.testCaseFailureStatusType  
    0                                             New   
    
      testCaseFailureStatus.updatedAt  
    0                   1696257581779  
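
To make the two arguments more explicit, here is a minimal sketch using the keyword names record_path and meta. It assumes a hypothetical second entry (rowCount) in testResultValue, which is not in the original file, to show that each element of the record path becomes its own row while the metadata is repeated.

```python
import pandas as pd

record = {
    "timestamp": 1696257561,
    "testCaseStatus": "Failed",
    "testResultValue": [
        {"name": "nullCount", "value": "2"},
        {"name": "rowCount", "value": "10"},  # hypothetical entry, for illustration only
    ],
    "testCaseFailureStatus": {"testCaseFailureStatusType": "New"},
}

df = pd.json_normalize(
    record,
    record_path="testResultValue",  # one row per element of this list
    meta=[
        "timestamp",
        "testCaseStatus",
        # a nested field is addressed by its path, given as a list of keys
        ["testCaseFailureStatus", "testCaseFailureStatusType"],
    ],
)
print(df)
```

With two entries in the list you should get two rows, each carrying the same timestamp and status.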
    
  3. Think about what a DataFrame is: a pandas DataFrame holds tabular data made up of rows and columns.

    Your JSON is not tabular as it stands, so it needs preprocessing first.

    ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
    

    This error comes from the array (testResultValue) in your JSON being mixed with a nested object (testCaseFailureStatus).
    In my view, the error is to be expected unless you reshape the data.
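
A minimal sketch of the point above, using the record from the question. As far as I understand it, when pandas is given a dict it treats every value as a column, so a list value and a dict value side by side leave it unable to decide on a row index. Wrapping the record in a list makes it a single row instead, although the nested values are then stored as plain Python objects rather than flattened.

```python
import pandas as pd

record = {
    "timestamp": 1696257561,
    "testCaseStatus": "Failed",
    "result": "Found nullCount=2. It should be 0",
    "testResultValue": [{"name": "nullCount", "value": "2"}],
    "testCaseFailureStatus": {
        "testCaseFailureStatusType": "New",
        "updatedAt": 1696257581779,
    },
}

# Passing the dict directly: each value is interpreted as a column
try:
    pd.DataFrame(record)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Passing a list of one dict: the dict is interpreted as one row
df = pd.DataFrame([record])
print(df.shape)
```

This runs without the error, but the list and dict cells still need flattening before the result is useful as a table.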

  4. The error you’re encountering typically occurs when you try to convert a nested dictionary structure into a pandas DataFrame directly. To convert this JSON data into a DataFrame, you need to flatten the nested structure first. You can use the following code:

    import pandas as pd
    import json
    
    # Your JSON data
    data = {
        "testCaseResult": {
            "timestamp": 1696257561,
            "testCaseStatus": "Failed",
            "result": "Found nullCount=2. It should be 0",
            "testResultValue": [
                {
                    "name": "nullCount",
                    "value": "2"
                }
            ],
            "testCaseFailureStatus": {
                "testCaseFailureStatusType": "New",
                "updatedAt": 1696257581779
            }
        }
    }
    
    # Recursively flatten nested dicts into a single-level dict (lists are kept as-is)
    def flatten_json(json_obj, parent_key='', separator='_'):
        items = {}
        for key, value in json_obj.items():
            new_key = parent_key + separator + key if parent_key else key
            if isinstance(value, dict):
                items.update(flatten_json(value, new_key, separator=separator))
            else:
                items[new_key] = value
        return items
    
    flat_data = flatten_json(data)
    
    # Convert to DataFrame
    testCaseResult_df = pd.DataFrame([flat_data])
    
    # Print the DataFrame
    print(testCaseResult_df)
    

    Here we recursively flatten the nested JSON structure and then convert the flattened data into a pandas DataFrame.
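
The recursive function above descends into dicts only, so a list such as testResultValue ends up as a single cell. A possible variation, sketched here under the assumption that list positions should become part of the column name (the helper name flatten is hypothetical), also walks lists:

```python
import pandas as pd

def flatten(obj, parent_key="", sep="_"):
    """Flatten nested dicts and lists into one single-level dict."""
    if isinstance(obj, dict):
        pairs = obj.items()
    elif isinstance(obj, list):
        pairs = enumerate(obj)  # list positions act as keys
    else:
        return {parent_key: obj}  # scalar: nothing left to flatten
    items = {}
    for key, value in pairs:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        items.update(flatten(value, new_key, sep))
    return items

record = {
    "timestamp": 1696257561,
    "testCaseStatus": "Failed",
    "result": "Found nullCount=2. It should be 0",
    "testResultValue": [{"name": "nullCount", "value": "2"}],
    "testCaseFailureStatus": {
        "testCaseFailureStatusType": "New",
        "updatedAt": 1696257581779,
    },
}

flat = flatten(record)
df = pd.DataFrame([flat])
print(df.columns.tolist())
```

Note that empty dicts and empty lists produce no columns in this sketch, and a list with many elements produces many columns, so json_normalize with a record path may be the better fit when the list should become rows.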
