Today I have a challenge at my school, it is convert a CSV file to a JSON file.
This CSV has a table structure (which means contains the information from an oracle table in this example). So i have to convert this CSV file into a hierarchical JSON.
My CSV example contain the next information:
TABLE_NAME,TIPO_CARGA,COLUMN_NAME,DATA_TYPE
CUENTAS,Full load,PERFIL,NVARCHAR2
CUENTAS,Full load,LLAMADA_INI,DATE
CUENTAS,Full load,LLAMADA_END,DATE
CUENTAS_PLAN,Delta,REGISTRO,NVARCHAR2
CUENTAS_PLAN,Delta,ULTIMA_VISITA,DATE
CUENTAS_PLAN,Delta,FECHA,DATE
CUENTAS_TACTICAS,Delta,CIUDAD,NVARCHAR2
CUENTAS_TACTICAS,Delta,SALIDA,NVARCHAR2
CUENTAS_TACTICAS,Delta,COMPLETADO,DATE
CUENTAS_MIEMBROS,Full load,NEGOCIO,NVARCHAR2
CUENTAS_MIEMBROS,Full load,CORE,NVARCHAR2
CUENTAS_MIEMBROS,Full load,DESC,NVARCHAR2
and I start with python using some libraries like pandas and also json, the next code is this:
import pandas as pd
import json
df = pd.read_csv("list_tablas_short.csv")
print(df.head())
thisisjson = df.to_json(orient='records')
print('CSV to JSON:n', thisisjson)
thisisjson_dict = json.loads(thisisjson)
with open('data.json', 'w') as json_file:
json.dump(thisisjson_dict, json_file)
And the result is:
[
{
"TABLE_NAME;"COLUMN_NAME";"DATA_TYPE"": "CUENTAS;PERFIL;NVARCHAR2"
},
{
"TABLE_NAME;"COLUMN_NAME";"DATA_TYPE"": "CUENTAS;LLAMADA_INI;DATE"
},
{...}
]
This is nto the way i’m looking for.
The idea is get something like this:
{
"tablas": [
{
"nombre": "CUENTAS",
"load_type": "Full load"
"fields": [
{
"campo": "PERFIL",
"tipo": "NVARCHAR2"
},
{
"campo": "LLAMADA_INI",
"tipo": "DATE"
},
{
"campo": "LLAMADA_END",
"tipo": "DATE"
}
]
},
{
"nombre": "CUENTAS_PLAN",
"load_type": "Delta"
"fields": [
{
"campo": "REGISTRO",
"tipo": "NVARCHAR2"
},
{
"campo": "ULTIMA_VISITA",
"tipo": "DATE"
},
{
"campo": "FECHA",
"tipo": "DATE"
}
]
}
]
}
Can somebody help me with this?
2
Answers
Use list comprehension with
DataFrame.rename
andDataFrame.to_dict
:If need also
tablas
:For final
json
use:Using a dictionary comprehension with
groupby
andto_dict
:Variant:
Output: