skip to Main Content

I would like to get a cross-table by using pandas dataframe with Python like this:

Home Away Tot
played 19 19 38
wins 10 8 18
draws 7 5 12
loss 2 6 8
goals_for 40 26 66
goals_against 17 19 36
avg_goal_for 2.1 1.4 1.7
avg_goal_against 0.9 1.0 0.9

and the json file is :

{
"get": "teams/statistics",
"response": {
    "fixtures": {
        "played": {
            "home": 19,
            "away": 19,
            "total": 38
        },
        "wins": {
            "home": 10,
            "away": 8,
            "total": 18
        },
        "draws": {
            "home": 7,
            "away": 5,
            "total": 12
        },
        "loses": {
            "home": 2,
            "away": 6,
            "total": 8
        }
    },
    "goals": {
        "for": {
            "total": {
                "home": 40,
                "away": 26,
                "total": 66
            },
            "average": {
                "home": "2.1",
                "away": "1.4",
                "total": "1.7"
            }
        },
        "against": {
            "total": {
                "home": 17,
                "away": 19,
                "total": 36
            },
            "average": {
                "home": "0.9",
                "away": "1.0",
                "total": "0.9"
            }
        }
    }

}

}

I don’t know how to handle all nested dictionaries in an easy way.
Please someone can help me out.
Thanks

2

Answers


  1. I think you first need to un-nest your dictionary, I recommend a recursive function to do so.

    if you want to do something that works for any occasion you can do something such as

    final_dict = {}
    
    def verify_keys(dicto):
        # this verify is dicto is a consistent block
        for val in dicto.values():
            keys = list(val.keys())
            equal = keys==['home', 'away', 'total']
            if not equal:
                return False
        return True
    
    def append_to_unested_dict(dicto, my_dict, key):
        # this function adds data to your dictionary
        if verify_keys(dicto):
            for name, item in dicto.items():
                my_dict[f"{key}_{name}"] = item 
            return True
        return False
    
    def deep_nested(dicto, my_dict, base_key=None):
        # this is the recursive fucntion
        for key, item in dicto.items():
            name_key = key if not base_key else f"{base_key}_{key}"
            appended = append_to_unested_dict(item, my_dict, name_key)
            if not appended:
                deep_nested(item, my_dict, key)
            
    
    for key, item in response.items():
        if not isinstance(item, dict):
            continue
        deep_nested(item, final_dict)
    
    df = pd.DataFrame(final_dict).T
    

    output

                           home     away    total
    fixtures_played         19       19     38
    fixtures_wins           10        8     18
    fixtures_draws           7        5     12
    fixtures_loses           2        6     8
    goals_for_total         40       26     66
    goals_for_average       2.1     1.4     1.7
    goals_against_total     17       19     36
    goals_against_average   0.9     1.0     0.9
    
    Login or Signup to reply.
  2. One way to do this is to process both halves of your response separately, using json_normalize to read the data, then splitting the column names to a multi-level index and then using stack to move to a long format from the wide format. You can then drop the top-level index and concat the two dataframes:

    data = json.loads(jstr)
    
    fixtures = pd.json_normalize(data['response']['fixtures'])
    fixtures.columns = fixtures.columns.str.split('.', expand=True)
    fixtures = fixtures.stack(level=0)
    fixtures.index = fixtures.index.droplevel()
    fixtures
    #         away  home  total
    # draws      5     7     12
    # loses      6     2      8
    # played    19    19     38
    # wins       8    10     18
    
    goals = pd.json_normalize(data['response']['goals'])
    goals.columns = goals.columns.str.split(r'.(?!.*.)', expand=True)
    goals = goals.stack(level=0)
    goals.index = goals.index.droplevel()
    goals
    #                 away home total
    # against.average  1.0  0.9   0.9
    # against.total     19   17    36
    # for.average      1.4  2.1   1.7
    # for.total         26   40    66
    
    out = pd.concat([fixtures, goals])
    

    Output:

                    away home total
    draws              5    7    12
    loses              6    2     8
    played            19   19    38
    wins               8   10    18
    against.average  1.0  0.9   0.9
    against.total     19   17    36
    for.average      1.4  2.1   1.7
    for.total         26   40    66
    

    Note, if desired you can rename the index column to match your question using

    out.index = ['draws', 'loses', 'played', 'wins', 'goals_against_average', 'goals_against_total', 'goals_for_average', 'goals_for_total']
    

    Output:

                          away home total
    draws                    5    7    12
    loses                    6    2     8
    played                  19   19    38
    wins                     8   10    18
    goals_against_average  1.0  0.9   0.9
    goals_against_total     19   17    36
    goals_for_average      1.4  2.1   1.7
    goals_for_total         26   40    66
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search