I have the following nested dictionary with lists inside:
[{
"id": 467,
"status": 2,
"leavePeriod": {
"owner": {
"employeeNumber": "2620",
"firstName": "fn_467",
"lastName": "ln_467"
},
"ownerId": 46,
"leaves": [
{
"leaveAccount": {
"id": 1121,
"name": "Vacation days 2021/2022",
"url": "https://some_link/1121"
},
"date": "2023-04-06T00:00:00"
},
{
"leaveAccount": {
"id": 1121,
"name": "Vacation days 2021/2022",
"url": "https://some_link/1121"
},
"date": "2023-04-06T00:00:00"
},
{
"leaveAccount": {
"id": 1121,
"name": "Vacation days 2021/2022",
"url": "https://some_link/1121"
},
"date": "2023-04-07T00:00:00"
},
{
"leaveAccount": {
"id": 1121,
"name": "Vacation days 2021/2022",
"url": "https://some_link/1121"
},
"date": "2023-04-07T00:00:00"
},
{
"leaveAccount": {
"id": 1121,
"name": "Vacation days 2021/2022",
"url": "https://some_link/1121"
},
"date": "2023-04-11T00:00:00"
},
{
"leaveAccount": {
"id": 1121,
"name": "Vacation days 2021/2022",
"url": "https://some_link/1121"
},
"date": "2023-04-11T00:00:00"
}
]
}
},
{
"id": 477,
"status": 2,
"leavePeriod": {
"owner": {
"employeeNumber": "2522",
"firstName": "fn_477",
"lastName": "lm_477"
},
"ownerId": 41,
"leaves": [
{
"leaveAccount": {
"id": 1121,
"name": "Vacation days 2021/2022",
"url": "https://some_link/1121"
},
"date": "2023-03-13T00:00:00"
},
{
"leaveAccount": {
"id": 1121,
"name": "Vacation days 2021/2022",
"url": "https://some_link/1121"
},
"date": "2023-03-13T00:00:00"
},
{
"leaveAccount": {
"id": 1323,
"name": "RTT 2023",
"url": "https://some_link/1323"
},
"date": "2023-03-14T00:00:00"
},
{
"leaveAccount": {
"id": 1323,
"name": "RTT 2023",
"url": "https://some_link/1323"
},
"date": "2023-03-14T00:00:00"
},
{
"leaveAccount": {
"id": 1323,
"name": "RTT 2023",
"url": "https://some_link/1323"
},
"date": "2023-03-15T00:00:00"
},
{
"leaveAccount": {
"id": 1323,
"name": "RTT 2023",
"url": "https://some_link/1323"
},
"date": "2023-03-15T00:00:00"
},
{
"leaveAccount": {
"id": 1323,
"name": "RTT 2023",
"url": "https://some_link/1323"
},
"date": "2023-03-16T00:00:00"
},
{
"leaveAccount": {
"id": 1323,
"name": "RTT 2023",
"url": "https://some_link/1323"
},
"date": "2023-03-16T00:00:00"
},
{
"leaveAccount": {
"id": 1323,
"name": "RTT 2023",
"url": "https://some_link/1323"
},
"date": "2023-03-17T00:00:00"
},
{
"leaveAccount": {
"id": 1323,
"name": "RTT 2023",
"url": "https://some_link/1323"
},
"date": "2023-03-17T00:00:00"
}
]
}
}]
How can I flatten it and convert it to a dataframe using this library?
from pandas.io.json import json_normalize
I’d like this to be done in one step if possible, instead of flattening to a dataframe twice and then merging them
Thanks in advance.
2
Answers
Just use
pd.json_normalize(<put_json_here>)
.Using
pandas.json_normalize
Data
Option 1
Or if you’re lazy and don’t wanna split all the meta columns up by hand…
Parameter Descriptions + Explanations
record_path
: str or list of str, default NonePath in each object to list of records. If not passed, data will be assumed to be an array of records.
meta
: list of paths (str or list of str), default NoneFields to use as metadata for each record in resulting table.
Option 2
To "flatten" the
leavePeriod.leaves
even further…Note: you can’t see all the columns in the image because it’s too wide so I have two images