skip to Main Content

Hey guys I’ve been working on converting some json text I’m receiving from an API and I noticed some people using json_normalize, but in my case it doesn’t solve the full issue and I was wondering if someone could help.
import pandas as pd

my_json = [
    {
        "total": "null",
        "items": [
            {
                "key": "time",
                "label": "Time",
                "value": "2022-12-13T23:59:59.939-07:00"
            },
            {
                "key": "agentNotes",
                "label": "Agent Notes",
                "value": "null"
            },
            {
                "key": "blindTransferToAgent",
                "label": "Blind Transfer To Agent",
                "value": "0"
            }]},
  {"total": "null",
        "items": [
            {
                "key": "time",
                "label": "Time",
                "value": "2022-12-13T23:59:59.939-07:00"
            },
            {
                "key": "agentNotes",
                "label": "Agent Notes",
                "value": "null"
            },
            {
                "key": "blindTransferToAgent",
                "label": "Blind Transfer To Agent",
                "value": "0"
            }
        ]}]
df = pd.json_normalize(my_json, ["items"])
print(df)

This gives me a result like this

               key  ...                          value
0                  time  ...  2022-12-13T23:59:59.939-07:00
1            agentNotes  ...                           null
2  blindTransferToAgent  ...                              0
[3 rows x 3 columns]

But I’m trying to create my keys as columns and the values as the values so the end result look like this.

time agentNotes blindTransfertoAgent

2022-12-13T23:590:59.939-07:00 null 0

Any help would be appreciated.

2

Answers


  1. Try changing this:

    df = pd.json_normalize(my_json, ["items"])
    

    into this:

    df = pd.json_normalize(my_json, ["items"]).T
    

    The T attribute in a Pandas DataFrame object stores a transposition of the index and columns, which is what you’re looking for.

    Output:

                                       0            1                        2
    key                             time   agentNotes     blindTransferToAgent
    label                           Time  Agent Notes  Blind Transfer To Agent
    value  2022-12-13T23:59:59.939-07:00         null                        0
    
    Login or Signup to reply.
  2. I did not find any shortcuts for this problem and maybe someone could enlighten us.
    However the solution isn’t that long, so I thought to post it anyways.

    Your "JSON" isn’t really a JSON if I am reading correctly from your question, it is a list that contains a dictionary with two keys, total and items. The value of items is a list of dictionaries, so we can iterate through the values and take the keyvalue elements from each one:

    from collections import defaultdict
    import pandas as pd
    
    dict_to_df = defaultdict(list)
    dictionaries = [inner_dicts for items_dict in my_json for inner_dicts in items_dict['items']]
    
    for dictionary in dictionaries:
        dict_to_df[dictionary['key']].append(dictionary['value'])
    
    df = pd.DataFrame.from_dict(dict_to_df, orient='index').T
    print(df)
    

    Which outputs:

                                time agentNotes blindTransferToAgent
    0  2022-12-13T23:59:59.939-07:00       null                    0
    

    Explanations:
    • Initialize an empty defaultdict (with default value of a list) which we will read to a pandas dataframe.
    • Insert the values per key in the "JSON" we have
    • Read the dictionary into a pandas DataFrame – orienting the indices and transposing, if these values don’t have the same length. (such as having one more blindTransferToAgent with value 1) to get around if the JSON looks like:
            {
                "key": "time",
                "label": "Time",
                "value": "2022-12-13T23:59:59.939-07:00"
            },
            {
                "key": "agentNotes",
                "label": "Agent Notes",
                "value": "null"
            },
            {
                "key": "blindTransferToAgent",
                "label": "Blind Transfer To Agent",
                "value": "0"
            },
            {
                "key": "blindTransferToAgent",
                "label": "Blind Transfer To Agent",
                "value": "4"
            }
    

    Which will output:

                                time agentNotes blindTransferToAgent
    0  2022-12-13T23:59:59.939-07:00       null                    0
    1                           None       None                    4
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search