It should be a simple line of code using the pd.json_normalize function, but it only works on a single entry and does not batch process my whole column.
Original dataframe
df['addresses'][0]
[{'addressLine1': '124 Main Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}]
test = pd.json_normalize(result['addresses'][0])
test
Everything up to this point works, but when I apply the function to the whole column, the resulting dataframe turns out to look like this:
test = pd.json_normalize(result['addresses'])
test
Here is some of the column data:
[[{'addressLine1': '124 Main Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '1234 Main Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Chattanooga',
'region': 'TN',
'postalCode': '37402',
'country': 'USA'}],
[{'addressLine1': '1684151 Chair Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Notaplace',
'region': 'AL',
'postalCode': '48835',
'country': 'USA'}],
[{'addressLine1': '136 Main Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '22118',
'country': 'USA'}],
[{'addressLine1': '123452 HoneyDo LN',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '123 Main Street',
'addressLine2': 'Apt 2B',
'addressLine3': 'Building B',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '123 Main Street',
'addressLine2': 'Apt 2B',
'addressLine3': 'Building B',
'city': 'New York City',
'region': 'NY',
'postalCode': '10001',
'country': 'USA'}],
[{'addressLine1': '123 Main Street',
'addressLine2': 'Apt 2B',
'addressLine3': 'Building B',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '4578 Shiver Me Timbers Road',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '124 Main ST',
'addressLine2': '',
'addressLine3': '',
'city': 'PORTLAND',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}]]
2 Answers
If I understand you correctly, you can transform your dataframe df with dict data using the following example, which prints the result:
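One way this could look, as a minimal sketch rather than a definitive answer: it assumes each cell of the addresses column holds a list of dicts, and the two-row frame and the names exploded and flat are made up for illustration. Exploding first also covers rows that hold more than one address.

```python
import pandas as pd

# Hypothetical two-row frame shaped like the data in the question
df = pd.DataFrame({
    "addresses": [
        [{"addressLine1": "124 Main Street", "city": "Portland",
          "region": "ME", "postalCode": "04019", "country": "USA"}],
        [{"addressLine1": "1234 Main Street", "city": "Chattanooga",
          "region": "TN", "postalCode": "37402", "country": "USA"}],
    ]
})

# explode() gives every list element its own row, so each row holds one dict
exploded = df["addresses"].explode().reset_index(drop=True)

# json_normalize takes a list of dicts and builds one column per key
flat = pd.json_normalize(exploded.tolist())
print(flat)
```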
It seems your list has one-element lists as elements.
Let's say your list is address_list; then you take the first element of each inner list and pass the result to json_normalize.
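A short sketch of that idea, assuming address_list is a plain Python list whose items are one-element lists of dicts (the sample values and the names first_items and normalized are illustrative only):

```python
import pandas as pd

# Hypothetical sample: a list of one-element lists, as in the question
address_list = [
    [{"addressLine1": "124 Main Street", "city": "Portland", "region": "ME"}],
    [{"addressLine1": "1234 Main Street", "city": "Chattanooga", "region": "TN"}],
]

# Unwrap each inner list so that a flat list of dicts remains
first_items = [inner[0] for inner in address_list]

# A flat list of dicts is the input shape json_normalize expects
normalized = pd.json_normalize(first_items)
print(normalized)
```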
If the test data that you posted is actually a column then just use:
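For the column case, something along these lines may work. This is a sketch under the assumption that every cell of result['addresses'] is a non-empty list with exactly one dict; the small result frame is invented for the example.

```python
import pandas as pd

# Hypothetical stand-in for the question's result dataframe
result = pd.DataFrame({
    "addresses": [
        [{"addressLine1": "136 Main Street", "city": "Portland", "postalCode": "22118"}],
        [{"addressLine1": "123 Main Street", "city": "New York City", "postalCode": "10001"}],
    ]
})

# Take the single dict out of each cell, then normalize the whole column at once
test = pd.json_normalize([cell[0] for cell in result["addresses"]])
print(test)
```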
Or if you have other columns in addition to addresses in your result dataframe:
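A possible way to keep the other columns, sketched under the same one-dict-per-cell assumption. The customer column and the names addresses and combined are hypothetical; the index is copied over so the join lines rows up correctly even when result does not use a default index.

```python
import pandas as pd

# Hypothetical result frame with one extra column besides addresses
result = pd.DataFrame(
    {
        "customer": ["A", "B"],
        "addresses": [
            [{"addressLine1": "124 Main ST", "city": "PORTLAND"}],
            [{"addressLine1": "123452 HoneyDo LN", "city": "Portland"}],
        ],
    },
    index=[10, 20],  # non-default index to show the alignment step matters
)

# Normalize the unwrapped dicts, then reuse the original index for alignment
addresses = pd.json_normalize([cell[0] for cell in result["addresses"]])
addresses.index = result.index

# Replace the nested column with its flattened columns
combined = result.drop(columns="addresses").join(addresses)
print(combined)
```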