skip to Main Content

I am attempting to make an API call. For this specific API, one of the keys in the JSON file needs to have a nested dictionary inside of it.

here is the input data in dataframe format:

ID        Date        Total_Transactions   Amount    Account_Name__c
1234567   2022-12-21  1                    235.00    a1234567
2345678   2022-13-21  2                    300.50    a2345678

The end result needs to look like this with a key "Account_Name__r" outside of the nested dictionary:

[{'ID': '1234567',
  'Date': '2022-12-21',
  'Total_Transactions': 1,
  'Amount': 235.00,
  'Account_Name__r': {'Account_Name__c':'a1234567'}},
{'ID': '2345678',
  'Date': '2022-13-21',
  'Total_Transactions': 2,
  'Amount': 300.50,
  'Account_Name__r': {'Account_Name__c':'a2345678'}}]

The data is coming from a data frame. I can get a normal data frame to export properly, but having issues with the nesting.
Here’s what it looks like when I do the normal dataframe as a normal json:

code:

final.to_json(orient='records')

output:

[{'ID': '1234567',
  'Date': '2022-12-21',
  'Total_Transactions': 1,
  'Amount': 235.00,
  'Account_Name__c':'a1234567'},
{'ID': '2345678',
  'Date': '2022-13-21',
  'Total_Transactions': 2,
  'Amount': 300.50,
  'Account_Name__c':'a2345678'}]

Any ideas how i need to structure my dataframe and what transformations/functions I need to use to get the nested structure I have at the top? I am looking to achieve this by performing vectorized operations in pandas and by using the df.to_json() method in pandas.

I am not looking for a for loop solution, that is easy but does not actually help me learn how to create different kinds of complex JSON structures from a pandas dataframe and in my case is not scalable for the large datasets I’ll be using.

3

Answers


  1. Chosen as BEST ANSWER

    I found the answer by breaking this down into a smaller problem to solve. I posted the question here: Is there a way to store a dictionary on each row of a dataframe column using a vectorized operation?

    User Panda Kim gets credit for solving the initial problem: https://stackoverflow.com/users/20430449/panda-kim

    This is the solution using the answer that Panda Kim uses along with the final step needed that I pieced together.

    First, we name a new column named for the key we'll use later outside of the wrapped dictionary and we'll get the values for the column by transposing the column name 'Account_Name__c' and it's corresponding value using the .T method, and setting it to a dictionary using to_dict()

    final_insert['Account_Name__r'] = pd.Series(final_insert[['Account_Name__c']].T.to_dict())
    

    the result:

    ID      Date       Total_Transactions Account_Name__r
    1234567 2022-12-21 1                  {'Account_Name__c':'a1234567'}
    

    Finally, we then transform the entire dataframe to a dictionary or a json using either .to_dict() or .to_json()

    final_insert = final_insert.to_dict(orient='records')
    

    The result:

    [{'ID': '1234567',
      'Date': '2022-12-21',
      'Total_Transactions': 1,
      'Amount': 235.00,
      'Account_Name__r': {'Account_Name__c':'a1234567'}}]
    

  2. Try this:

    data=[{'ID': '1234567',
      'Date': '2022-12-21',
      'Total_Transactions': 1,
      'Amount': 235.00,
      'Account_Name__c':'a1234567'},
    {'ID': '2345678',
      'Date': '2022-13-21',
      'Total_Transactions': 2,
      'Amount': 300.50,
      'Account_Name__c':'a2345678'}]
    
    df=pd.DataFrame(data)
    
    
    df["Account_Name__r"]=df["Account_Name__c"].apply(lambda x: {"Account_Name__c":x})
    
    df.drop(columns=["Account_Name__c"],inplace=True)
    
    print(df.to_json(orient='records'))
    
    Login or Signup to reply.
  3. Try:

    out = df.to_dict(orient="records")
    
    for d in out:
        d["ID"] = str(d["ID"])
        d["Account_Name__r"] = {"Account_Name__c": d.pop("Account_Name__c")}
    
    print(out)
    

    Prints:

    [
        {
            "ID": "1234567",
            "Date": "2022-12-21",
            "Total_Transactions": 1,
            "Amount": 235.0,
            "Account_Name__r": {"Account_Name__c": "a1234567"},
        },
        {
            "ID": "2345678",
            "Date": "2022-13-21",
            "Total_Transactions": 2,
            "Amount": 300.5,
            "Account_Name__r": {"Account_Name__c": "a2345678"},
        },
    ]
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search