skip to Main Content

I have a dataframe which contains nested columns as below when converted in to json format.

{"id" : 1,
 "address":
  {"adress1":"",
   "city" : "TX",
   "zip": ""
  },
 "gender": "",
  "code": 999
}

The dataframe is normalized and will be written in to snowflake variant column. I would like to replace empty values("") with nulls before it inserted in to snowflake.

I tried using df.replace('',np.NAN, inPlace='True). But it replaced only gender with null. I am not able to see address1 and zip value with nulls. They are still empty. Can someone help how we can handle this scenario.

2

Answers


  1. Try this approach:

    str_response = json.dumps(response)
    df = pd.json_normalize(json.loads(str_response))
    df.replace('',np.NAN, inplace=True)
    print(df)
    

       id  gender  code  address.address1 address.city  address.zip
    0   1     NaN   999               NaN           TX          NaN
    
    Login or Signup to reply.
  2. for your particular case you can use for loops to replace '' with np.nan

    You can use nested loops to filtrate inside 2 level of json. If you have more levels of nested you can use more forloops in try block:

    import numpy as np
    
    j={"id" : 1,
       "address":{"adress1":"","city" : "TX","zip": ""},
       "gender": "",
       "code": 999}
    
    for x in j:
        if j[x]=='':
            j[x]=np.nan
        try:
            for y in j[x]:
                if j[x][y]=='':
                    j[x][y]=np.nan
        except:
            pass
    
    print(j)
    #{'id': 1,
     'address': {'adress1': nan, 'city': 'TX', 'zip': nan},
     'gender': nan,
     'code': 999}
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search