Example

{"data":"value1","version":"value2","version1":"value3"}
{"data":"value1","version1":"value3"}
{"data":"value1","version1":"value3","hi":{"a":"true,"b":"false"}}

I have a JSON file and need to convert it to CSV. However, the rows do not all have the same columns, and some rows have nested attributes. How can I convert them in a Python script?

I tried converting the JSON to CSV with Python code, but it gives me an error.

2 Answers


  1. One way to convert a JSON file to a CSV file in Python is to use the pandas library.

    import pandas as pd
    
    data = [
       {
         "data": "value1",
         "version": "value2",
         "version1": "value3"
       },
       {
         "data": "value1",
         "version1": "value3"
       },
       {
         "data": "value1",
         "version1": "value3",
         "hi": {
           "a": "true",
           "b": "false"
         }
       }
    ]
    
    df = pd.DataFrame(data)
    df.to_csv('data.csv', index=False)
    

    I have corrected the formatting of your JSON, since the original was not valid and was giving errors.
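
    Note that pd.DataFrame(data) keeps the nested "hi" object in a single column, so the CSV cell would contain the whole dictionary as text. A minimal sketch of one way to get separate columns instead, assuming pandas is installed, is pandas.json_normalize, which expands nested dictionaries into dotted column names. The variable names here (records, flat, csv_text) are only illustrative.

```python
import pandas as pd

records = [
    {"data": "value1", "version": "value2", "version1": "value3"},
    {"data": "value1", "version1": "value3"},
    {"data": "value1", "version1": "value3", "hi": {"a": "true", "b": "false"}},
]

# json_normalize expands nested dicts into dotted column names such as "hi.a".
# Keys missing from a record become NaN, which to_csv writes as empty fields.
flat = pd.json_normalize(records)

# With no path argument, to_csv returns the CSV as a string.
csv_text = flat.to_csv(index=False)
print(csv_text)
```

    To write a file instead of a string, pass a path, for example flat.to_csv('data.csv', index=False).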

  2. You could convert the JSON data to a flat list of lists with column names on the first line. Then process that to make the CSV output.

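    def flatDict(D, p=""):
        # Leaf value: return it under an empty key so the caller's key path is kept.
        if not isinstance(D, dict):
            return {"": D}
        # Join each key with the flattened keys of its value; nested levels get a "." prefix.
        return {p + k + s: v for k, d in D.items() for s, v in flatDict(d, ".").items()}

    def flatData(data):
        lines = [*map(flatDict, data)]                      # one flat dict per record
        names = dict.fromkeys(k for d in lines for k in d)  # ordered union of all keys
        return [[*names]] + [[*map(line.get, names)] for line in lines]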
    

    The flatDict function converts a nested dictionary into a single-level dictionary whose keys join the nested key names with dots. It works recursively, so it handles any depth of nesting.
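
    Because the comprehension in flatDict is dense, the same recursion can be sketched with an explicit loop. The function name flatten and the sample record are hypothetical, and the sketch is only meant to behave like flatDict for ordinary string keys.

```python
def flatten(value, prefix=""):
    """Return a one-level dict; nested keys are joined with dots."""
    if not isinstance(value, dict):
        # A leaf: store it under the key path built so far.
        return {prefix: value}
    flat = {}
    for key, child in value.items():
        # Extend the path with a dot only below the top level.
        path = f"{prefix}.{key}" if prefix else key
        flat.update(flatten(child, path))
    return flat


nested = {"data": "value1", "hi": {"a": "true", "deep": {"b": "false"}}}
flat = flatten(nested)
print(flat)
```

    Each recursive call handles one level of nesting, so the three-level "hi" -> "deep" -> "b" path ends up as the single key "hi.deep.b".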

    The flatData function flattens each record to build a list of flat dictionaries (lines). The union of all keys in that list gives the column names; building it with dict.fromkeys keeps the keys in order of first appearance. The function returns the list of names followed by one row per record, where each row looks up every column name with the dictionary's .get() method, so missing keys become None.

    Example usage and output:

    E = [{"data":"value1","version":"value2","version1":"value3"},
    {"data":"value1","version1":"value3"},
    {"data":"value1","version1":"value3","hi":{"a":"true","b":"false"}} ]
    
    for line in flatData(E):
        print(line)
    
    ['data',   'version', 'version1', 'hi.a', 'hi.b']    # col names
    ['value1', 'value2',  'value3',    None,   None]     # data ...
    ['value1',  None,     'value3',    None,   None]
    ['value1',  None,     'value3',   'true', 'false']
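
    The remaining step, turning that list of lists into CSV, can be sketched with the standard csv module. This assumes the input has one JSON object per line, as in the question. flatDict and flatData are repeated from above so the block runs on its own; rows_to_csv and json_lines are hypothetical names.

```python
import csv
import io
import json


def flatDict(D, p=""):
    if not isinstance(D, dict):
        return {"": D}
    return {p + k + s: v for k, d in D.items() for s, v in flatDict(d, ".").items()}


def flatData(data):
    lines = [*map(flatDict, data)]
    names = dict.fromkeys(k for d in lines for k in d)
    return [[*names]] + [[*map(line.get, names)] for line in lines]


def rows_to_csv(rows):
    """Render a list of rows as CSV text; csv.writer writes None as an empty field."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Assumed input format: one JSON object per line.
json_lines = """\
{"data":"value1","version":"value2","version1":"value3"}
{"data":"value1","version1":"value3"}
{"data":"value1","version1":"value3","hi":{"a":"true","b":"false"}}
"""

records = [json.loads(line) for line in json_lines.splitlines() if line.strip()]
csv_text = rows_to_csv(flatData(records))
print(csv_text)
```

    To write a file rather than a string, the same csv.writer call can be given a file opened with open('data.csv', 'w', newline='').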
    