Question posted in Json
Our archive of expertly curated questions and answers provides insights and solutions to common problems related to this popular data interchange format. From parsing and manipulating JSON data to integrating it with various programming languages and web services, our archive has got you covered. Start exploring today and take your JSON skills to the next level

How do I make my JSON file readable in my pandas dataframe?

Lilcodemuffin
October 31, 2023
196 views
0 votes
2 Answers

Here is my code:

with open(r'unique_columns.json', 'r') as f:
    config = json.load(f)

unique_col_comb = config['Unique_Column_Combination']['TABLE_NAME']

df = pd.read_csv(f's3://path/to/file.csv', sep='|')
df_unique = df.set_index([unique_col_comb]).index.is_unique
print(df_unique)

My JSON looks like this:

{
    "Unique_Column_Combination":
    {
        "TABLE_NAME": "COL1, COL2, COL3"
    }

}

I get the error:

KeyError: "None of ['COL1, COL2, COL3'] are in the columns"

But when I actually write out the columns in df_unique, the code works:

df_unique = df.set_index(['COL1', 'COL2', 'COL3']).index.is_unique
>>True

I think I need to add an extra quote (") to the end of each column name in my JSON file but then it won’t be in proper JSON format. Can I add it in the python code? Or do I need to convert my JSON dictionary to a python list?

Answers

- anonymous
- October 31, 2023 at 4:17 pm
- 0 votes
0
"COL1, COL2, COL3" is a single string. You should probably split it into three column names with unique_col_comb.split()

Login or Signup to reply.

- DL
- October 31, 2023 at 4:33 pm
- 0 votes
0
The json does not decompose into what would be a dataframe.

Here is an example of json that can be read:
```
import pandas as pd
import json

json_string = """
{
  "employees": [
    {
      "name": "John Doe",
      "email": "[email protected]",
      "phone": "555-555-5555"
    },
    {
      "name": "Jane Doe",
      "email": "[email protected]", 
      "phone": "444-444-4444"
    }
  ]
}
"""

j = json.loads(json_string)

df = pd.DataFrame(j['employees'])
print(df)
```
I didn’t mention df.read_json() in the example because this part worked and it is the format of the json that failed.

In your example this json would work if square brackets added for example:
```
{
    "Unique_Column_Combination":[
    {
        "TABLE_NAME": "COL1, COL2, COL3"
    }
    ]
}
```
to give this:
```
         TABLE_NAME
0  COL1, COL2, COL3
```
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.