I’m trying to separate some data into a metadata section and the actual measurements taken. Each file has multiple measurements.
The .json file looks like this:
{
"version": "1",
"data": [
{ ##This starts the first point
"metadata1": 1.1, ##metadata parts
"metadata2": 2.1, #
"metadata3": 3.1, #
"metadata4": 4.1, #
"metadata5": 5.5, #
"measurements": [ ##This is the part that I can't get out
{ #
"datax": 0, #This is the first x-value of the first measurement
"datay": 10 #This is the first y-value of the first measurement
},
{
"datax": .3, #This is the second x-value of the first measurement
"datay": 15 #This is the second y-value of the first measurement
},
{
"datax": .7,
"datay": 25
},
{
"datax": 1.1,
"datay": 40
},
{
"datax": 1.7,
"datay": 55
},
]
},
{ ##This starts the second point
"metadata1": 1.2, ##metadata parts
"metadata2": 2.2, #
"metadata3": 3.2, #
"metadata4": 4.4, #
"metadata5": 5.5, #
"measurements": [ ##This is the part that I can't get out
{ #
"datax": 1, #This is the first x-value of the second measurement
"datay": 20 #This is the first y-value of the second measurement
},
{
"datax": 2.3,
"datay": 35
},
{
"datax": 3.7,
"datay": 25
},
{
"datax": 4.1,
"datay": 30
},
{
"datax": 5.7,
"datay": 32
},
]
}
]
}
I can get the metadata out easily, but only the first x and y values of each measurement get grabbed and stuck into the same slot. Any attempt to get the data out of the measurements results in errors. I think the biggest problem is that all the x and y values are labeled the same, which throws everything off. I want to group all of the first measurement's datax and datay values into lists, then do the same for the second measurement, and so on.
import json
import pandas as pd

with open('Filename.json') as f:
    data = json.load(f)
df = pd.DataFrame(data['data'])
display(df)
| metadata1 | metadata2 | metadata3 | metadata4 | metadata5 | measurements |
|---|---|---|---|---|---|
| 1.1 | 2.1 | 3.1 | 4.1 | 5.1 | [{'datax': 0, 'datay': 10}] |
| 1.2 | 2.2 | 3.2 | 4.2 | 5.2 | [{'datax': 1, 'datay': 20}] |
I’ve tried reading the measurements from the file in a similar way, but it only gives KeyError: 'measurements'
with open('Filename.json') as a:
    measurements = json.load(a)
df1 = pd.DataFrame(data['measurements'])
What I want is one list for each metadata field, and that part is currently working.
What I now need is to separate the measurement data into:
- datax into a list of datax at point 1,
- datay into a list of datay at point 1,
- datax into a list of datax at point 2,
- datay into a list of datay at point 2,
- etc.
The metadata will always be in the same spot in each file, but the measurements section can vary in length.
I’m still new to this and have never worked with JSON files before, so forgive me if this has a simple solution. Also feel free to ask questions if I didn’t explain something well.
2 Answers
IIUC you can do:
Prints:
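A minimal sketch of one way to flatten the nested records, assuming the file has first been made valid JSON (comments removed, trailing commas dropped, `.3` written as `0.3`). The inline `raw` string is a trimmed, hypothetical stand-in for the question's file, and `pd.json_normalize` is one reasonable choice here, not necessarily the call this answer originally used:

```python
import json

import pandas as pd

# Hypothetical, trimmed copy of the question's file with the syntax fixed:
# no comments, no trailing commas, and 0.3 instead of .3.
raw = """
{
  "version": "1",
  "data": [
    {"metadata1": 1.1, "metadata2": 2.1,
     "measurements": [{"datax": 0, "datay": 10}, {"datax": 0.3, "datay": 15}]},
    {"metadata1": 1.2, "metadata2": 2.2,
     "measurements": [{"datax": 1, "datay": 20}, {"datax": 2.3, "datay": 35}]}
  ]
}
"""
data = json.loads(raw)

# One row per measurement; the listed metadata keys are repeated on each row.
flat = pd.json_normalize(
    data["data"],
    record_path="measurements",
    meta=["metadata1", "metadata2"],
)
print(flat)
```

Each row then carries one `datax`/`datay` pair together with the metadata of the point it came from, so a variable number of measurements per point is not a problem.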
OR: looking at your comment, maybe:
Prints:
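If plain lists per point are the goal, a sketch using only the standard library may be enough. It assumes the JSON has already been made valid; the `raw` string is a shortened, hypothetical version of the question's data, and the names `datax` and `datay` for the results are chosen here for illustration:

```python
import json

# Hypothetical, shortened version of the question's data (valid JSON).
# The two points deliberately have different numbers of measurements.
raw = """
{
  "version": "1",
  "data": [
    {"metadata1": 1.1,
     "measurements": [{"datax": 0, "datay": 10},
                      {"datax": 0.3, "datay": 15},
                      {"datax": 0.7, "datay": 25}]},
    {"metadata1": 1.2,
     "measurements": [{"datax": 1, "datay": 20},
                      {"datax": 2.3, "datay": 35}]}
  ]
}
"""
data = json.loads(raw)

# One inner list per point, in file order: datax[0] holds the x-values
# of the first point, datax[1] those of the second, and so on.
datax = [[m["datax"] for m in point["measurements"]] for point in data["data"]]
datay = [[m["datay"] for m in point["measurements"]] for point in data["data"]]

print(datax)  # [[0, 0.3, 0.7], [1, 2.3]]
print(datay)  # [[10, 15, 25], [20, 35]]
```

Because each inner list is built independently, points with different numbers of measurements are handled without padding.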
Ignoring the comments in the example JSON, there are still a few minor syntax and formatting issues. However, once you fix those you can access the `keys` in your JSON-turned-DataFrame like so:
Each `metadata` item is available in the dataframe as a separate key as well, e.g.:
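A minimal sketch of that kind of access, assuming the file has been cleaned up and loaded. The `records` list is a hypothetical stand-in for `data['data']`, and the variable names are illustrative only:

```python
import pandas as pd

# Hypothetical stand-in for data['data'] after json.load on a cleaned-up file.
records = [
    {"metadata1": 1.1, "metadata2": 2.1,
     "measurements": [{"datax": 0, "datay": 10}, {"datax": 0.3, "datay": 15}]},
    {"metadata1": 1.2, "metadata2": 2.2,
     "measurements": [{"datax": 1, "datay": 20}]},
]
df = pd.DataFrame(records)

# Each cell of the "measurements" column holds that point's list of dicts.
first_point = df["measurements"].iloc[0]
first_x = [m["datax"] for m in first_point]
first_y = [m["datay"] for m in first_point]

# Each metadata field is an ordinary column, one value per point.
metadata1 = df["metadata1"].tolist()

print(first_x, first_y, metadata1)
```

Note that `pd.DataFrame(data['measurements'])` raises the `KeyError` from the question because `measurements` is not a top-level key; it only exists inside each element of `data['data']`.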