The data is a long list of dictionaries from a JSON file. Each dictionary has the same keys but values of multiple types, and sometimes these values are null. I need to know the type of each value so I can initialize the appropriate variables elsewhere.
An example of data would be like:
[{"Name": "null, "Age": 23, "Wage": 16.5},
{"Name": "jason", "Age": null, "Wage": 22.5},
{"Name": "blake", "Age": null, "Wage": 23.8},
{"Name": null, "Age": 26, "Wage": null}]
And I'm trying to get the resulting types of each, which would be <string, int, float>.
Since the JSON can often have 100,000+ elements as opposed to the 4 in the example, I wasn't sure whether it makes sense to just iterate until all types are determined or if there is a more efficient way. I am currently working in both Python and C++.
2 Answers
To start, when working with JSON in Python you want to import the json library; converting a string to Python objects with it automatically handles the datatype conversions. If you are using something like the requests library to fetch your data, you can use the .json() method, as demoed in the link below.
https://www.geeksforgeeks.org/response-json-python-requests/#
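For instance, a minimal sketch of that conversion, using a one-element sample taken from the question:

    import json

    raw = '[{"Name": null, "Age": 23, "Wage": 16.5}]'
    data = json.loads(raw)  # null -> None, 23 -> int, 16.5 -> float
    print(data[0])          # {'Name': None, 'Age': 23, 'Wage': 16.5}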
As for determining data types in Python, you can use the built-in type() function. You can find all the built-in datatypes below.
https://www.w3schools.com/python/python_datatypes.asp
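Putting the two together, here is a minimal sketch of the "iterate until all types are determined" approach from the question; the filename data.json and the helper name infer_types are hypothetical. Because it stops as soon as every key has been seen with a non-null value, it usually touches only a handful of the 100,000+ records:

    import json

    def infer_types(records):
        """Map each key to the type of its first non-null value."""
        types = {}
        for record in records:
            for key, value in record.items():
                if key not in types and value is not None:
                    types[key] = type(value)
            # every dict has the same keys, so we can stop early
            if len(types) == len(records[0]):
                break
        return types

    with open("data.json") as f:  # hypothetical filename
        data = json.load(f)

    print(infer_types(data))
    # {'Name': <class 'str'>, 'Age': <class 'int'>, 'Wage': <class 'float'>}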
I’m using pandas until it doesn’t work.
Or you can use polars, which will similarly infer a type for each column. 100,000 rows should be handled easily by either of these tools.
You probably want to read your file using the library's read function, rather than reading the JSON into a Python list of dicts first. Try something like the sketch below.
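A minimal sketch, assuming the data lives in a file called data.json (hypothetical name):

    import pandas as pd

    # read_json parses the file straight into a typed DataFrame
    df = pd.read_json("data.json")

    # dtypes shows the inferred type of each column; note that nulls
    # force pandas to promote an int column to float64 (Age here)
    print(df.dtypes)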
If you're using newline-delimited JSON, you can do a lazy read, which should (?) avoid any memory concerns for large files.
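For example, a sketch using polars' lazy NDJSON reader (assuming a recent polars version and a hypothetical file data.ndjson with one JSON object per line):

    import polars as pl

    # scan_ndjson builds a lazy query; the file is not loaded up front
    lf = pl.scan_ndjson("data.ndjson")

    # the inferred schema maps each column name to its polars dtype
    print(lf.collect_schema())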