I have a DataFrame with these columns: id, jan_x, jan_y, feb_x, feb_y, ..., dec_x, dec_y
that I would like export as a json that is structured like this:
{
"id1": [
[jan_x, feb_x, ..., dec_x],
[jan_y, feb_y, ..., dec_y]
],
"id2": [
[jan_x, feb_x, ..., dec_x],
[jan_y, feb_y, ..., dec_y]
]
}
The initial keys e.g., id1
correspond to an id in column id
of my dataframe. Without any custom parsing function is there a straight forward functional way of achieving this? I have tried dumping it as a json but the desired list structure isn’t captured.
Here is a sample data frame with just two months.
data = {'id': ['1', '2', '3', '4'],
'jan_x': [1, 2, 3, 4],
'jan_y': [5, 6, 7, 8],
'feb_x': [9, 10, 11 12],
'feb_y': [13 14, 15, 16]}
df = pd.DataFrame(data)
Sample Output:
{
"1": [
[1, 9],
[5, 13]
],
"2": [
[2, 10],
[6, 14]
],
"3": [
[3, 11],
[7, 15]
],
"4": [
[4, 12],
[8, 16]
]
}
2
Answers
Update:
Use
to_dict('list')
after aggregate by common suffix (‘x’, ‘y’, …):Output:
I am not sure what you mean by "any custom parsing function", but here is my attempt.
The solution uses
iterrows
go iterate through rows. For each row, I take the columns that ends with _x usingx_columns
variable that I defined above, and the same for columns that ends with _y.row[x_columns]
gives you the values as a pandas Series, therefore you need to convert it to a list before adding to fit your desired format. I append all those to a result variable and print the result variable at the end.Here is my example input:
and here is the output: