skip to Main Content

I’m getting data with Facebook Insights API and there are nested columns in the data I get. I tried separating them by index but failed.

column I want to split:

[{'action_type': 'link_click', 'value': '1'}, {'action_type': 'video_view', 'value': '1'}]

the state i want to translate:

actions_video_view  actions_link_click
 1                   1
xx = dataframe['actions'].apply(pd.Series).merge(dataframe["index"],
        right_index=True,
        left_index=True).melt(id_vars=['index'],
            value_name='actions')
xx2 = xx['action_type'].apply(pd.Series).merge(xx["index"],
        right_index=True, 
        left_index=True)
xx2 = xx2.loc[xx2['action_type'] == 'video_view', ["value", "index"]]

when i run this code i get the following error:

Traceback (most recent call last):
  File "C:ProgramDataAnaconda3libsite-packagespandascoreframe.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:ProgramDataAnaconda3libsite-packagespandascoreindexesbase.py", line 3363, in get_loc      
    raise KeyError(key) from err
KeyError: 'action_type'

I want to separate the column according to the keys and add it as a dataframe column, which way can I use for this?

An example of how it looks in the data:

actions
[{'action_type': 'link_click', 'value': '1'}, {'action_type': 'video_view', 'value': '1'}]
[{'action_type': 'link_click', 'value': '3'}, {'action_type': 'video_view', 'value': '3'}]
[{'action_type': 'link_click', 'value': '5'}, {'action_type': 'video_view', 'value': '5'}]
[{'action_type': 'link_click', 'value': '6'}, {'action_type': 'video_view', 'value': '6'}]
[{'action_type': 'link_click', 'value': '7'}, {'action_type': 'video_view', 'value': '7'}]

if i want to apply:

actions_link_click  actions_video_view
1                    1
3                    3
5                    5
6                    6
7                    7

2

Answers


  1. This does the job,

    grouped_df = df.groupby("action_type")
    new_df = pd.DataFrame([])
    
    for action in df["action_type"].unique():
      new_df[f"actions_{action}"] = list(grouped_df.get_group(action)["value"])
    

    Output –

    actions_link_click actions_video_view
    0 1 1
    1 3 3
    2 5 5
    3 6 6
    4 7 7
    Login or Signup to reply.
  2. I think you should have a look on how that data is generated like that to each row of a dataframe. I think that is not straightforward. But for your current problem, here is a solution:

    import pandas as pd
    import json
    
    def convert_df(col):
        tmp = col.apply(pd.Series)
        out = (tmp.assign(idx=tmp.groupby('action_type').cumcount())
        .pivot(index='idx', columns='action_type', values='value')
        .add_prefix('actions_').rename_axis(columns=None).reset_index(drop=True))
        return out
    
    rows = [[{'action_type': 'link_click', 'value': '1'}, {'action_type': 'video_view', 'value': '1'}],
    [{'action_type': 'link_click', 'value': '3'}, {'action_type': 'video_view', 'value': '3'}],
    [{'action_type': 'link_click', 'value': '5'}, {'action_type': 'video_view', 'value': '5'}],
    [{'action_type': 'link_click', 'value': '6'}, {'action_type': 'video_view', 'value': '6'}],
    [{'action_type': 'link_click', 'value': '7'}, {'action_type': 'video_view', 'value': '7'}],]
    
    
    df = pd.DataFrame({'actions' : rows})
    df = pd.json_normalize(df['actions'])
    
    df = pd.concat([
        convert_df(df[0]),
        convert_df(df[1])
    ], axis=1)
    print(df)
    
      actions_link_click actions_video_view
    0                  1                  1
    1                  3                  3
    2                  5                  5
    3                  6                  6
    4                  7                  7
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search