
I have an Excel file with the following structure:

Title    List
Title_1  ['str_1', 'str_2']
Title_2  ['str_3', 'str_4']

and I want to get the data in a JSON structure like this:

{"0":{"Title": "Title_1", "List": ['str_1', 'str_2']}, "1":{"Title": "Title_2", "List": ['str_3', 'str_4']}}

instead of:

{"0":{"Title": "Title_1", "List": "['str_1', 'str_2']"}, "1":{"Title": "Title_2", "List": "['str_3', 'str_4']"}}

How can I achieve this using the pandas module in Python?

I have tried:

import pandas as pd

df = pd.read_excel("my_excel.xlsx")

df.to_dict("index")

and get:

{"0":{"Title": "Title_1", "List": "['str_1', 'str_2']"}, "1":{"Title": "Title_2", "List": "['str_3', 'str_4']"}}

2 Answers


  1. To get the desired data type, you can apply the following transformation:

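    import ast

    def convert_to_list(string_value):
        # Parse the cell as a Python literal; fall back to the raw value on failure
        try:
            return ast.literal_eval(string_value)
        except (ValueError, SyntaxError):
            return string_value

    # Replace each string cell in the 'List' column with the parsed object
    df['List'] = df['List'].apply(convert_to_list)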
    

    This function takes a string value and does the following:

    • It tries to safely evaluate the input string as a Python literal expression (here, a list).
    • If the string evaluates successfully, it returns the resulting object; if a ValueError or SyntaxError is raised, it returns the original string unchanged.
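
As a rough end-to-end sketch of the steps above: the DataFrame below is a hypothetical stand-in for the Excel file (built in memory rather than read with read_excel), with the list cells stored as strings, as in the question.

```python
import ast

import pandas as pd

# Hypothetical stand-in for the Excel data: list cells arrive as plain strings.
df = pd.DataFrame({
    "Title": ["Title_1", "Title_2"],
    "List": ["['str_1', 'str_2']", "['str_3', 'str_4']"],
})


def convert_to_list(string_value):
    """Parse a Python literal; return the input unchanged if parsing fails."""
    try:
        return ast.literal_eval(string_value)
    except (ValueError, SyntaxError):
        return string_value


# Convert the string cells into real Python lists.
df["List"] = df["List"].apply(convert_to_list)

# Each row becomes a dict keyed by its index label.
records = df.to_dict("index")
print(records)
```

After the conversion, each "List" value in the resulting dict should be a real list rather than a string. Cells that do not parse as a literal are left untouched.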
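  2. If you want to use the pandas library, you can use the DataFrame method to_json, which returns the DataFrame serialized as a JSON string. Note that to_json writes cell values as they are: a List column that still holds strings is written as JSON strings, and it becomes JSON arrays only if the cells are real lists (for example after parsing them with ast.literal_eval).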

    df.to_json(orient="index")
    

    To validate the result you can use:

    import json
    pd.DataFrame.from_dict(json.loads(df.to_json(orient="index")), orient="index")
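
To illustrate how to_json treats the two kinds of cells, here is a small sketch using hypothetical in-memory DataFrames (not the asker's actual file): one whose List column holds strings, and one whose List column holds real lists.

```python
import json

import pandas as pd

# Hypothetical data: the 'List' cells are still strings, as read from Excel.
df_strings = pd.DataFrame({
    "Title": ["Title_1", "Title_2"],
    "List": ["['str_1', 'str_2']", "['str_3', 'str_4']"],
})

# Hypothetical data: the 'List' cells are real Python lists.
df_lists = pd.DataFrame({
    "Title": ["Title_1", "Title_2"],
    "List": [["str_1", "str_2"], ["str_3", "str_4"]],
})

# Serialize both frames and parse the JSON text back to inspect the types.
parsed_strings = json.loads(df_strings.to_json(orient="index"))
parsed_lists = json.loads(df_lists.to_json(orient="index"))

print(type(parsed_strings["0"]["List"]).__name__)
print(type(parsed_lists["0"]["List"]).__name__)
```

Only the second frame should produce JSON arrays, which suggests parsing the string cells before calling to_json.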
    