I am using the Facebook API (v2.10) to extract the data I need, and 95% of it is perfect. My problem is the ‘actions’ metric, which is returned as a dictionary within a list within another dictionary.
At present, all the data is in a DataFrame; however, the ‘actions’ column is a list of dictionaries containing each individual action for that day.
{
    "actions": [
        {
            "action_type": "offsite_conversion.custom.xxxxxxxxxxx",
            "value": "7"
        },
        {
            "action_type": "offsite_conversion.custom.xxxxxxxxxxx",
            "value": "3"
        },
        {
            "action_type": "offsite_conversion.custom.xxxxxxxxxxx",
            "value": "144"
        },
        {
            "action_type": "offsite_conversion.custom.xxxxxxxxxxx",
            "value": "34"
        }
    ]
}
All this appears in one cell (row) within the DataFrame.
What is the best way to:
- Get the action type, create a new column, and use “action_type” as the column name?
- List the correct value under this column?
It looks like JSON, but when I check the type, it’s a pandas Series (stored as object).
For those willing to help (thank you, I greatly appreciate it): can you either point me towards the right material, which I will read and work through on my own (I’m not entirely sure what to look for), or, if you decide this is an easy problem, explain how and why you solved it this way? I don’t just want the answer.
I have tried the following (with help from a friend) and it kind of works, but I have issues running it in my script, i.e., if it runs within a bigger code block, I get the error shown further down:
for i in range(df.shape[0]):
    line = df.loc[i, 'Conversions']
    L = ast.literal_eval(line)
    for l in L:
        cid = l['action_type']
        value = l['value']
        df.loc[i, cid] = value
If I save the DataFrame as a CSV and load it back using pd.read_csv, it executes properly, but not within the script. No idea why.
Error:
ValueError: malformed node or string: [{'value': '1', 'action_type': 'offsite_conversion.custom.xxxxx}]
Any help would be greatly appreciated.
Thanks,
Adrian
2 Answers
You can use json_normalize.
Alternatively, you can use:
    df.join(pd.DataFrame(df['Conversions'].tolist()).pivot(columns='action_type', values='value').reset_index(drop=True))
Explanation:
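df['Conversions'].tolist() returns a list of dictionaries, assuming each cell holds a single dictionary. (If each cell holds a list of dictionaries, as in the question, pd.DataFrame produces positional columns of dictionaries instead, and pivot will not find an action_type column.) This list is then transformed into a DataFrame using pd.DataFrame. Then you can use the pivot function to reshape the table into the form you want. Lastly, you can join the table with your original DataFrame. Note that this only works if your DataFrame’s index is the default (i.e., integers starting from 0). If this is not the case, you can do this instead: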
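df2 = pd.DataFrame(df['Conversions'].tolist()).pivot(columns='action_type', values='value').reset_index(drop=True)
for col in df2.columns:
    # Assign by position; a plain df[col] = df2[col] would align on the index and produce NaN
    df[col] = df2[col].to_numpy()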
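
If each cell of the column holds a list of dictionaries, as in the question's sample, one option is to turn every cell into a single {action_type: value} dictionary and build the new columns from those. The sketch below is not the pivot approach above but a plain dict comprehension. parse_actions and expand_actions are hypothetical helper names, the sample data is made up, and the column name 'Conversions' is taken from the question. It calls ast.literal_eval only when a cell is a string (as it is after a CSV round trip), because literal_eval raises "ValueError: malformed node or string" when handed an object that is already a list.

```python
import ast

import pandas as pd


def parse_actions(cell):
    """Return a list of action dicts from a cell holding a list or its string form."""
    if isinstance(cell, str):
        # After pd.read_csv the cell is text such as "[{'action_type': ...}]"
        return ast.literal_eval(cell)
    if isinstance(cell, list):
        # Straight from the API response the cell is already a Python list
        return cell
    # Anything else (for example NaN on a day without actions) yields no columns
    return []


def expand_actions(df, column="Conversions"):
    """Add one column per action_type, keeping the original index of df."""
    rows = [
        # If an action_type repeats within one cell, the last value wins
        {action["action_type"]: action["value"] for action in parse_actions(cell)}
        for cell in df[column]
    ]
    wide = pd.DataFrame(rows, index=df.index)
    return df.join(wide)


# Made-up sample: one cell is a real list, the other is its string form
df = pd.DataFrame(
    {
        "date": ["2017-09-01", "2017-09-02"],
        "Conversions": [
            [
                {"action_type": "link_click", "value": "7"},
                {"action_type": "post_like", "value": "3"},
            ],
            "[{'value': '1', 'action_type': 'link_click'}]",
        ],
    },
    index=[10, 11],  # deliberately not the default 0..n-1 index
)

result = expand_actions(df)
print(result)
```

Because the new frame is built with index=df.index, the join does not depend on the DataFrame having a default integer index. The values stay strings, as the API returns them; pd.to_numeric can convert the new columns if numbers are needed.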