I am trying to replace a value in a pandas dataframe cell that I am constructing from a JSON file that I get from Airtable API the JSON file looks like this one:
[
{
"id": "rec0Y7ydscZnU5NjZ",
"createdTime": "2023-03-17T16:32:57.000Z",
"fields": {
"field_id": [
"recq6vsFWI2H8j84i"
],
"field_name": "S-166",
"currency": [
"Dolares"
],
"state": [
5
],
"roi": {
"specialValue": "Infinity"
},
"investment_return": {
"specialValue": "NaN"
},
"value_amount": 16559.51178956216,
"income": [
31978.5
]
}
}
]
Obviously this is a sample and I am replacing the real keys and some values, because they are from a private application, but what I am trying to do once I get the API response and then convert the JSON into a dataframe using the code below, is to replace the values {"specialValue": "Infinity"}
from roi key and {"specialValue": "NaN"}
from investment_return key with zeros:
#read_json.py
def read_json():
# Open the JSON file
with open('json_test.json', 'r') as file:
# Parse the JSON data
data = json.load(file)
# Extract all fields from the "fields" dictionary and create a DataFrame
df1 = pd.DataFrame([record['id'] for record in data],columns=['id'])
df2 = pd.DataFrame([record['fields'] for record in data])
df = pd.concat([df1,df2], axis=1)
return df
Then once I have the JSON data converted into a dataframe I am trying to replace the values using this code:
df.replace('{"specialValue": "Infinity"}', np.nan, inplace=True)
For some reason is not working:
Here I am using the read_json()
function I described above to generate the dataframe from the JSON file gotten from API response, and inspecting the resulting dataframe:
df = read_json()
print(df)
id field_id field_name currency state
0 rec0Y7ydscZnU5NjZ [recq6vsFWI2H8j84i] S-166 [Dolares] [5]
roi investment_return value_amount
0 {'specialValue': 'Infinity'} {'specialValue': 'NaN'} 16559.51179
income
0 [31978.5]
Then once I want to replace()
for example the string {"specialValue": "Infinity"}
by NaN
with numpy:
# replace string value with NaN
df.replace('{"specialValue": "Infinity"}', np.nan, inplace=True)
print(df)
what I would expect is to get for the column roi
an nan value, but I am still getting the same string that I receive from the API:
0 rec0Y7ydscZnU5NjZ [recq6vsFWI2H8j84i] S-166 [Dolares] [5]
roi investment_return value_amount
0 {'specialValue': 'Infinity'} {'specialValue': 'NaN'} 16559.51179
income
0 [31978.5]
Can somebody give me a possible approach on how to convert those into NaN or zeros?
2
Answers
After parsing your JSON, you no longer have strings, but objects.
In particular, you can check that:
You could take advantage of the mutability of dictionaries and of the
map
method (applymap
before pandas 2.1):Modified
df
:Easier to change the values in the dictionary before constructing the DataFrame.
For example:
Output: