I have a json output which looks like this.
{'pagination': {'limit': 100, 'offset': 0, 'count': 38, 'total': 38},
'data': [{'name': 'Ceco Environmental Corp',
'symbol': 'CECE',
'has_intraday': False,
'has_eod': True,
'country': None,
'stock_exchange': {'name': 'NASDAQ Stock Exchange',
'acronym': 'NASDAQ',
'mic': 'XNAS',
'country': 'USA',
'country_code': 'US',
'city': 'New York',
'website': 'www.nasdaq.com'}},
{'name': 'CEC CoreCast Corporation Ltd',
'symbol': '600764.XSHG',
'has_intraday': False,
'has_eod': True,
'country': None,
'stock_exchange': {'name': 'Shanghai Stock Exchange',
'acronym': 'SSE',
'mic': 'XSHG',
'country': 'China',
'country_code': 'CN',
'city': 'Shanghai',
'website': 'www.sse.com.cn'}},
{'name': 'CECEP WindPower Corp',
'symbol': '601016.XSHG',
'has_intraday': False,
'has_eod': True,
'country': None,
'stock_exchange': {'name': 'Shanghai Stock Exchange',
'acronym': 'SSE',
'mic': 'XSHG',
'country': 'China',
'country_code': 'CN',
'city': 'Shanghai',
'website': 'www.sse.com.cn'}},
{'name': 'CECONOMY AG INHABER-STAMMAKTIEN O.N.',
'symbol': 'CEC.XSTU',
'has_intraday': False,
'has_eod': True,
'country': None,
'stock_exchange': {'name': 'Börse Stuttgart',
'acronym': 'XSTU',
'mic': 'XSTU',
'country': 'Germany',
'country_code': 'DE',
'city': 'Stuttgart',
'website': 'www.boerse-stuttgart.de'}},
{'name': 'CECONOMY AG ST O.N.',
'symbol': 'CEC.XFRA',
'has_intraday': False,
'has_eod': True,
'country': None,
'stock_exchange': {'name': 'Deutsche Börse',
'acronym': 'FSX',
'mic': 'XFRA',
'country': 'Germany',
'country_code': 'DE',
'city': 'Frankfurt',
'website': 'www.deutsche-boerse.com'}},
{'name': 'CECONOMY AG ST O.N.',
'symbol': 'CEC.XETRA',
'has_intraday': False,
'has_eod': True,
'country': None,
'stock_exchange': {'name': 'Deutsche Börse Xetra',
'acronym': 'XETR',
'mic': 'XETRA',
'country': 'Germany',
'country_code': 'DE',
'city': 'Frankfurt',
'website': ''}},
{'name': 'CECEP COSTIN',
'symbol': '2228.XHKG',
'has_intraday': False,
'has_eod': True,
'country': None,
'stock_exchange': {'name': 'Hong Kong Stock Exchange',
'acronym': 'HKEX',
'mic': 'XHKG',
'country': 'Hong Kong',
'country_code': 'HK',
'city': 'Hong Kong',
'website': 'www.hkex.com.hk'}},
.....
I am trying to load it into a dataframe and filter the stock_exchange
column by country.
Here is my code.
import pandas as pd
data = api_result.json()
result = pd.DataFrame(data['data'])
result[result['stock_exchange'].str.contains('China')]
But I get the following error, KeyError: "None of [Float64Index([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,n nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,n nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan],n dtype='float64')] are in the [columns]"
However, if I save the output to a csv, and reload it back into a dataframe like this,
result.to_csv('result.csv')
result = pd.read_csv('result.csv')
result[result['stock_exchange'].str.contains('China')]
I get the filtered dataframe like this,
Unnamed: 0 name symbol has_intraday has_eod country stock_exchange
1 1 CEC CoreCast Corporation Ltd 600764.XSHG False True NaN {'name': 'Shanghai Stock Exchange', 'acronym':...
2 2 CECEP WindPower Corp 601016.XSHG False True NaN {'name': 'Shanghai Stock Exchange', 'acronym':...
Any idea why I can’t filter the dataframe without saving the frame to csv and reloading first?
2
Answers
Just use
json_normalize()
I should note that each cell ofstock_exchange
is adict
object not astr
object, which is part of the problem because you will always get NaN and not a bool so you cannot use boolean indexing like you currently are doing – i.e.,result[result['stock_exchange'].str.contains('China')]
You must create your dataframe again.
Use from this code