I have the data coming via REST api with nested json, Trying to explode the response but its flatteing in only the first level. Need to explode the nested part also.
First step im converting data into pd df then exploding each node (bdata, edata). But not able to explode the columns(topping) inside edata.
sample data:
{
"adata": {
"1": {
"xid": "012",
"xtype": "donut",
"xname": "xCake",
"xppu": 0.55
},
"2": {
"xid": "015",
"xtype": "donut",
"xname": "Cake",
"xppu": 0.565
},
"3": {
"xid": "018",
"xtype": "donut",
"xname": "Cakex",
"xppu": 0.559
}
},
"bdata": {
"1": [
{
"yid": "00012",
"ytype": "donut",
"yname": "Cake",
"yppu": 0.55
},
{
"yid": "00023",
"ytype": "donut",
"yname": "Raised",
"yppu": 0.554
},
{
"yid": "00024",
"ytype": "donut",
"yname": "Raised",
"yppu": 0.554
}
],
"2": [
{
"yid": "00015",
"ytype": "donut",
"yname": "Cake",
"yppu": 0.565
},
{
"yid": "00026",
"ytype": "donut",
"yname": "Raised",
"yppu": 0.557
},
{
"yid": "00027",
"ytype": "donut",
"yname": "Raised",
"yppu": 0.525
}
],
"3": [
{
"yid": "00018",
"ytype": "donut",
"yname": "Cake",
"yppu": 0.559
},
{
"yid": "00039",
"ytype": "donut",
"yname": "Old Fashioned",
"yppu": 0.558
}
]
},
"edata": {
"1": [
{
"eid": "03001",
"etype": "donut",
"name": "Cake",
"ppu": 0.55,
"topping": [
{"id": "51", "type": "None"},
{"id": "002", "type": "zGlazed"},
{"id": "05", "type": "Sugar"},
{"id": "5007", "type": "Powdered Sugar"},
{"id": "06", "type": "Chocolate with Sprinkles"},
{"id": "53", "type": "Chocolate"},
{"id": "04", "type": "Maple"}
]
},
{
"eid": "0302",
"etype": "donut",
"name": "Raised",
"ppu": 0.55
},
{
"eid": "0302",
"etype": "donut",
"name": "Raisedz",
"ppu": 0.55,
"topping": "None"
},
{
"eid": "03003",
"etype": "donut",
"name": "zOld Fashioned",
"ppu": 0.55,
"topping": [
{"id": "501", "type": "Nonex"},
{"id": "52", "type": "xGlazed"},
{"id": "503", "type": "Chocolatez"}
]
}
],
"2": [
{
"eid": "00401",
"etype": "donut",
"name": "Cake",
"ppu": 0.55,
"topping": [
{"id": "01", "type": "None"},
{"id": "2", "type": "xGlazed"},
{"id": "55", "type": "xSugar"},
{"id": "507", "type": "Powdered Sugar"},
{"id": "506", "type": "xChocolate with Sprinkles"},
{"id": "03", "type": "xChocolate"},
{"id": "54", "type": "xMaple"}
]
},
{
"eid": "042",
"etype": "donut",
"name": "Raised",
"ppu": 0.55
},
{
"eid": "042",
"etype": "donut",
"name": "Raisedx",
"ppu": 0.55,
"topping": "None"
}
],
"3": [
{
"eid": "051",
"etype": "donut",
"name": "Cake",
"ppu": 0.55,
"topping": [
{"id": "50407", "type": "Powdered Sugarx"},
{"id": "50406", "type": "Chocolate with Sprinklesx"},
{"id": "50403", "type": "Chocolatex"},
{"id": "50404", "type": "Maplex"}
]
},
{
"eid": "050403",
"etype": "donut",
"name": "Old Fashioned",
"ppu": 0.55,
"topping": [
{"id": "5071", "type": "None"},
{"id": "5072", "type": "Glazedx"},
{"id": "5703", "type": "Chocolatex"}
]
}
]
}
}
Im trying to get the result somewhat like below.
eid etype name ppu id type xid xtype xname xppu yid ytype yname yppu
0 03001 donut Cake 0.55 51 None 012 donut xCake 0.55 00012 donut Cake 0.55
1 03001 donut Cake 0.55 002 zGlazed 012 donut xCake 0.55 00012 donut Cake 0.55
2 03001 donut Cake 0.55 05 Sugar 012 donut xCake 0.55 00012 donut Cake 0.55
3 03001 donut Cake 0.55 5007 Powdered Sugar 012 donut xCake 0.55 00012 donut Cake 0.55
4 03001 donut Cake 0.55 06 Chocolate with Sprinkles 012 donut xCake 0.55 00012 donut Cake 0.55
5 03001 donut Cake 0.55 53 Chocolate 012 donut xCake 0.55 00012 donut Cake 0.55
6 03001 donut Cake 0.55 04 Maple 012 donut xCake 0.55 00012 donut Cake 0.55
7 0302 donut Raised 0.55 NaN None 015 donut Cake 0.565 00015 donut Cake 0.565
8 0302 donut Raisedz 0.55 None None 015 donut Cake 0.565 00015 donut Cake 0.565
2
Answers
You can extract data with
pd.json_normalize
:Output:
However, I don’t see how to merge
bdata
andedata
.Had to build a recursive iterator to get through the lists embedded within the dictionaries which is what the
drill_through
function is doing. The dictionaries with a topping were duplicated and atype
was assigned for those, otherwise it was set toNone
.