I am a bit rusty on pandas-to-JSON transforms.
I have a pandas data frame like this for a fictional library:
userId | visitId | readingTime | Books | BookType |
---|---|---|---|---|
u1 | 1 | 300 | book1,book2,book3 | Fiction |
u2 | 1 | 400 | book4,book5 | Horror |
u2 | 2 | 250 | book6 | Romance |
I need to create JSON like this:
{
    "visitSummary": {
        "u1": [
            {
                "readingTime": 300,
                "Books": [
                    "book1",
                    "book2",
                    "book3"
                ],
                "BookType": "Fiction"
            }
        ],
        "u2": [
            {
                "readingTime": 400,
                "Books": [
                    "book4",
                    "book5"
                ],
                "BookType": "Horror"
            },
            {
                "readingTime": 250,
                "Books": [
                    "book6"
                ],
                "BookType": "Romance"
            }
        ]
    }
}
I was thinking of doing it with nested loops, processing each row. I am hoping there is a simpler, more Pythonic way to do it.
I am using Python 3.10 and pandas 2.1.4.
4 Answers
You don’t need nested loops; a single loop will do. Here’s one way to do this:
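A minimal sketch of the single-loop idea, assuming the DataFrame from the question is rebuilt as `df`. The names `visits` and `result` are illustrative, and this is only one of several ways to write the loop:

```python
import json

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "userId": ["u1", "u2", "u2"],
    "visitId": [1, 1, 2],
    "readingTime": [300, 400, 250],
    "Books": ["book1,book2,book3", "book4,book5", "book6"],
    "BookType": ["Fiction", "Horror", "Romance"],
})

visits = {}
# One pass over the rows; setdefault creates a user's list the first time
# that user is seen, so no inner loop over users is needed.
for row in df.itertuples(index=False):
    visits.setdefault(row.userId, []).append({
        "readingTime": int(row.readingTime),   # plain int for json
        "Books": row.Books.split(","),         # "a,b" -> ["a", "b"]
        "BookType": row.BookType,
    })

result = {"visitSummary": visits}
print(json.dumps(result, indent=4))
```

Because the loop appends in row order, each user's visits keep the order they have in the DataFrame.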
Output:
Try:
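One possible version, sketched on the assumption that the idea is to loop over `groupby("userId")` groups, print the result, and write it to `data.json`. The variable names are illustrative and not taken from the answer:

```python
import json

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "userId": ["u1", "u2", "u2"],
    "visitId": [1, 1, 2],
    "readingTime": [300, 400, 250],
    "Books": ["book1,book2,book3", "book4,book5", "book6"],
    "BookType": ["Fiction", "Horror", "Romance"],
})

summary = {}
# Each iteration yields one user and the sub-frame holding that user's visits
for user_id, user_visits in df.groupby("userId", sort=False):
    summary[user_id] = [
        {"readingTime": int(t), "Books": books.split(","), "BookType": book_type}
        for t, books, book_type in zip(
            user_visits["readingTime"], user_visits["Books"], user_visits["BookType"]
        )
    ]

out = {"visitSummary": summary}
print(json.dumps(out, indent=4))

# Write the same structure to data.json in the current directory
with open("data.json", "w") as f:
    json.dump(out, f, indent=4)
```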
This prints the JSON and saves it to a data.json file.
With split, groupby & to_dict:
Output:
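A minimal sketch of the split, groupby and to_dict combination, assuming the question's data is rebuilt as `df`. The names `per_user`, `result` and `text` are illustrative:

```python
import json

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "userId": ["u1", "u2", "u2"],
    "visitId": [1, 1, 2],
    "readingTime": [300, 400, 250],
    "Books": ["book1,book2,book3", "book4,book5", "book6"],
    "BookType": ["Fiction", "Horror", "Romance"],
})

# split: turn each comma-separated string into a list of titles
df["Books"] = df["Books"].str.split(",")

# groupby + to_dict: one list of record dicts per user.
# Selecting the columns first keeps userId and visitId out of the records.
per_user = (
    df.groupby("userId")[["readingTime", "Books", "BookType"]]
      .apply(lambda g: g.to_dict(orient="records"))
)

result = {"visitSummary": per_user.to_dict()}
text = json.dumps(result, indent=4)
print(text)
```

Selecting the three value columns before `.apply()` is a design choice here: it avoids dropping `userId` and `visitId` from every record afterwards.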
Split the Books column into a list of strings using Series.str.split(). You can then use .groupby(), .apply() and .to_json() to get the desired output:
This outputs:
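Those steps might be sketched as follows, assuming the question's data is rebuilt as `df`. Wrapping the grouped Series in a one-column frame named `visitSummary` is an assumption made here to produce the outer key, and `per_user` and `json_text` are illustrative names:

```python
import json

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "userId": ["u1", "u2", "u2"],
    "visitId": [1, 1, 2],
    "readingTime": [300, 400, 250],
    "Books": ["book1,book2,book3", "book4,book5", "book6"],
    "BookType": ["Fiction", "Horror", "Romance"],
})

# Series.str.split(): comma-separated string -> list of strings
df["Books"] = df["Books"].str.split(",")

# .groupby() + .apply(): a Series indexed by userId whose values are
# lists of per-visit dicts
per_user = (
    df.groupby("userId")[["readingTime", "Books", "BookType"]]
      .apply(lambda g: g.to_dict(orient="records"))
)

# .to_json(): with the default "columns" orient, a one-column frame
# serializes as {column: {index: value}}, giving the outer key.
json_text = per_user.to_frame("visitSummary").to_json(indent=4)
print(json_text)
```

Unlike `json.dumps`, `.to_json()` returns the JSON string directly from pandas, so no intermediate Python dict is needed.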