
Loop through multiple items with the same name in a JSON and assign to variable Python

kiestuthridge23
December 9, 2022
382 views
0 votes
2 Answers

I’m trying to loop through a JSON document that has multiple object keys with the same name. I need to grab them and be able to tell them apart. Below is my JSON; I need to extract expenses, dt_posted and description from each object in the "results" list:

{
    "count": 1,
    "next": null,
    "previous": null,
    "results": [
        {
            "expenses": "1920000.00",
            "dt_posted": "2022-10-20T21:53:30-04:00",
            "lobbying_activities": [
                {
                    "description": "Providing information related to Apple Pay"
                },
                {
                    "description": "Issues related to transparency and government access to data, including H.R. 7072/S. 4373, the Non-Disclosure Order (NDO) Fairness Act"
                }
            ]
        },
        {
            "expenses": "178888.00",
            "dt_posted": "2022-10-15T21:53:30-04:00",
            "lobbying_activities": [
                {
                    "description": "Issues related tothe requirements of E.O. 14028, an Executive Order on Improving the Nation's Cybersecurity Issues related to cybersecurity requirements in H.R. 7900, the National Defense Authorization Act for Fiscal Year 2023"
                }
            ]
        }
    ]
}

My code attempts to loop through the JSON and extract the values into a list so I can process them further. However, I’m getting the error "unhashable type: 'dict'":

import requests
import pandas as pd

url = "https://api.npoint.io/1cae29b5fc8900f6cc5a"

r = requests.get(url)

df = pd.json_normalize(r.json())

z =[]

for x in df['results']:
    if df['results'][x] == 'expenses':
        z.append(x)
for x in df['results']:
    if df['results'][x] == 'dt_posted':
        z.append(x)
for x in df['results']:
    if df['results'][x] == 'description':
        z.append(x)

Ideally, the output would hold one dataset containing the "expenses", "dt_posted" and "description" from the first object, and a second dataset holding the same fields from the second object in the JSON.

Answers

- PawePietraszko
- December 9, 2022 at 11:34 am
- 0 votes
1. Read the JSON object.
2. Iterate over "results".
3. Build a row from each result, taking the first object in the list under the lobbying_activities key as the third element:
```
def json_result_to_row(result: dict) -> tuple:
    # The third element is the first activity dict; append ["description"] to get its text.
    return result["expenses"], result["dt_posted"], result["lobbying_activities"][0]
```
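The three steps above can be sketched end to end in plain Python. This is only an illustration under some assumptions: the payload is inlined with shortened descriptions instead of being fetched from the URL in the question, the helper name `result_to_record` is hypothetical, and it collects every description per result rather than only the first one.

```python
import json

# Inline sample shaped like the payload in the question (descriptions shortened).
payload = json.loads("""
{
  "count": 1,
  "results": [
    {"expenses": "1920000.00",
     "dt_posted": "2022-10-20T21:53:30-04:00",
     "lobbying_activities": [
        {"description": "Providing information related to Apple Pay"},
        {"description": "Issues related to transparency"}]},
    {"expenses": "178888.00",
     "dt_posted": "2022-10-15T21:53:30-04:00",
     "lobbying_activities": [
        {"description": "Issues related to cybersecurity"}]}
  ]
}
""")


def result_to_record(result: dict) -> dict:
    """Collect the wanted fields from one entry of "results" (hypothetical helper)."""
    return {
        "expenses": result["expenses"],
        "dt_posted": result["dt_posted"],
        # Keep every description in the nested list, not only the first.
        "descriptions": [
            activity["description"]
            for activity in result.get("lobbying_activities", [])
        ],
    }


# One record per object in "results", so the two datasets stay separate.
records = [result_to_record(result) for result in payload["results"]]
for record in records:
    print(record)
```

Because each record is its own dict, the first and second "expenses" values stay distinguishable by their position in `records`.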

You should pass record_path as an argument to json_normalize:

df = pd.json_normalize(r.json(), record_path="results")

This will give you a readable df:

     expenses                  dt_posted                                lobbying_activities
0  1920000.00  2022-10-20T21:53:30-04:00  [{'description': 'Providing information relate...
1   178888.00  2022-10-15T21:53:30-04:00  [{'description': 'Issues related tothe require...

Edit: at this point you can export your df to json:

df.to_json(indent=4, orient="records")

Output:

[
    {
        "expenses":"1920000.00",
        "dt_posted":"2022-10-20T21:53:30-04:00",
        "lobbying_activities":[
            {
                "description":"Providing information related to Apple Pay"
            },
            {
                "description":"Issues related to transparency and government access to data, including H.R. 7072/S. 4373, the Non-Disclosure Order (NDO) Fairness Act"
            }
        ]
    },
    {
        "expenses":"178888.00",
        "dt_posted":"2022-10-15T21:53:30-04:00",
        "lobbying_activities":[
            {
                "description":"Issues related tothe requirements of E.O. 14028, an Executive Order on Improving the Nation's Cybersecurity Issues related to cybersecurity requirements in H.R. 7900, the National Defense Authorization Act for Fiscal Year 2023"
            }
        ]
    }
]

Now since you can have more than one description per row, you can explode the "lobbying_activities" column and concat the first two columns with a new df made of "lobbying_activities":

df = df.explode("lobbying_activities").reset_index()
df = pd.concat([
    df[["expenses", "dt_posted"]],
    pd.DataFrame(df["lobbying_activities"].values.tolist())
    ], axis=1)

Output:

     expenses                  dt_posted                                        description
0  1920000.00  2022-10-20T21:53:30-04:00         Providing information related to Apple Pay
1  1920000.00  2022-10-20T21:53:30-04:00  Issues related to transparency and government ...
2   178888.00  2022-10-15T21:53:30-04:00  Issues related tothe requirements of E.O. 1402...
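As a possible alternative to explode plus concat, json_normalize can flatten the nested list in a single call by pointing record_path at "lobbying_activities" and listing the parent fields in meta. The sketch below assumes the same payload shape as in the question; the data is inlined with shortened descriptions, and the variable names `data` and `flat` are only illustrative.

```python
import pandas as pd

# Inline sample shaped like the payload in the question (descriptions shortened).
data = {
    "count": 1,
    "results": [
        {
            "expenses": "1920000.00",
            "dt_posted": "2022-10-20T21:53:30-04:00",
            "lobbying_activities": [
                {"description": "Providing information related to Apple Pay"},
                {"description": "Issues related to transparency"},
            ],
        },
        {
            "expenses": "178888.00",
            "dt_posted": "2022-10-15T21:53:30-04:00",
            "lobbying_activities": [
                {"description": "Issues related to cybersecurity"},
            ],
        },
    ],
}

# One row per activity; the parent's expenses and dt_posted are repeated on each row.
flat = pd.json_normalize(
    data["results"],
    record_path="lobbying_activities",
    meta=["expenses", "dt_posted"],
)

# Reorder the columns to match the layout shown above.
flat = flat[["expenses", "dt_posted", "description"]]
print(flat)
```

This should yield the same three-row layout, with one row per description.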
