I have a use case where a text file holds data in a key-value format.
The file does not follow any standard format; it is simply written as key=value lines.
We need to create JSON from that file.
I am able to create JSON, but when the text has an array-like structure, the result is flat key-value JSON rather than a JSON array structure.
This is my input:
[DOCUMENT]
Headline=This is Headline
MainLanguage=EN
DocType.MxpCode=1000
Subject[0].MxpCode=BUSNES
Subject[1].MxpCode=CONS
Subject[2].MxpCode=ECOF
Author[0].MxpCode=6VL6
Industry[0].CtbCode=53
Industry[1].CtbCode=5340
Industry[2].CtbCode=534030
Industry[3].CtbCode=53403050
Symbol[0].Name=EXPE.OQ
Symbol[1].Name=ABNB.OQ
WorldReg[0].CtbCode=G4
Country[0].CtbCode=G26
Country[1].CtbCode=G2V
[ENDOFFILE]
The existing code to create the JSON is below:
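import json

# Read every line of the key=value text file.
with open("file1.csv") as f:
    lines = f.readlines()

data = {}
for line in lines:
    # Keep only lines of the form key=value; markers such as [DOCUMENT] are skipped.
    parts = line.split('=')
    if len(parts) == 2:
        data[parts[0].strip()] = parts[1].strip()

print(json.dumps(data, indent=' '))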
The current output is below
{
 "Headline": "This is Headline",
 "MainLanguage": "EN",
 "DocType.MxpCode": "1000",
 "Subject[0].MxpCode": "BUSNES",
 "Subject[1].MxpCode": "CONS",
 "Subject[2].MxpCode": "ECOF",
 "Author[0].MxpCode": "6VL6",
 "Industry[0].CtbCode": "53",
 "Industry[1].CtbCode": "5340",
 "Industry[2].CtbCode": "534030",
 "Industry[3].CtbCode": "53403050",
 "Symbol[0].Name": "EXPE.OQ",
 "Symbol[1].Name": "ABNB.OQ",
 "WorldReg[0].CtbCode": "G4",
 "Country[0].CtbCode": "G26",
 "Country[1].CtbCode": "G2V"
}
The expected output is something like the following for the Subject key, and likewise for the others:
{
  "subject": [
    {
      "mxcode": 123
    },
    {
      "mxcode": 123
    },
    {
      "mxcode": 123
    }
  ]
}
Likewise for Industry, Symbol and Country.
So the idea is that when a key has a position (an index) in the text file, it should be treated as an array in the JSON output.
3 Answers
Use one more loop, since the structure is nested. Start a for loop where the Subject entries begin and try it that way.
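A minimal sketch of that idea, assuming indexed keys always have the shape Name[index].Field as in the sample input. The names INDEXED_KEY and parse_lines are made up for illustration, and a regular expression stands in for hand-written string slicing. The inner while loop pads the list until the requested position exists.

```python
import json
import re

# Matches keys such as "Subject[0].MxpCode": name, index, field.
# Assumption: indexed keys never nest deeper than this.
INDEXED_KEY = re.compile(r"^(\w+)\[(\d+)\]\.(\w+)$")


def parse_lines(lines):
    """Build a dict in which indexed keys become lists of objects."""
    data = {}
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # no "=" on the line, e.g. [DOCUMENT] or [ENDOFFILE]
        key, value = key.strip(), value.strip()
        match = INDEXED_KEY.match(key)
        if match is None:
            data[key] = value  # plain key, stored unchanged
            continue
        name, field = match.group(1), match.group(3)
        index = int(match.group(2))
        items = data.setdefault(name, [])
        # Inner loop: grow the list until position `index` exists.
        while len(items) <= index:
            items.append({})
        items[index][field] = value
    return data


sample = [
    "[DOCUMENT]",
    "Headline=This is Headline",
    "Subject[0].MxpCode=BUSNES",
    "Subject[1].MxpCode=CONS",
    "Symbol[0].Name=EXPE.OQ",
    "[ENDOFFILE]",
]
print(json.dumps(parse_lines(sample), indent=2))
```

To run it on the real file, pass the result of f.readlines() to parse_lines instead of the sample list. Keys without an index, such as DocType.MxpCode, stay flat in this version.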
The code below creates a dictionary, which is then output as a JSON array.
You could condense this code to your needs.
JSON Output:
Actually, the code looks like the previous answer, but I followed your expected output more closely, with each of Subject, Industry, etc. being a list of dictionaries.
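A rough sketch of one way to get that shape, offered as an assumption about the approach rather than a definitive implementation. It treats every key as a dotted path, so DocType.MxpCode also becomes a nested object, which goes slightly beyond what the question asks. The helper names SEGMENT, set_path and parse_document are hypothetical, and keys are assumed to be used consistently (the same name is never both a plain value and a list).

```python
import json
import re

# One path segment: a name with an optional [index],
# e.g. "Subject[2]" or "MxpCode".
SEGMENT = re.compile(r"^(\w+)(?:\[(\d+)\])?$")


def set_path(root, key, value):
    """Store value under a dotted key, creating dicts and lists as needed."""
    matches = [SEGMENT.match(part) for part in key.split(".")]
    if not all(matches):
        root[key] = value  # unrecognised key shape: keep it flat
        return
    node = root
    last = len(matches) - 1
    for pos, match in enumerate(matches):
        name, index = match.group(1), match.group(2)
        is_leaf = pos == last
        if index is None:
            if is_leaf:
                node[name] = value
            else:
                node = node.setdefault(name, {})
            continue
        items = node.setdefault(name, [])
        i = int(index)
        while len(items) <= i:
            items.append({})  # pad so that position i exists
        if is_leaf:
            items[i] = value
        else:
            node = items[i]


def parse_document(lines):
    """Parse key=value lines into nested dicts and lists."""
    doc = {}
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if sep:  # lines without "=" (section markers) are ignored
            set_path(doc, key.strip(), value.strip())
    return doc


example = [
    "DocType.MxpCode=1000",
    "Industry[0].CtbCode=53",
    "Industry[1].CtbCode=5340",
    "Country[0].CtbCode=G26",
]
print(json.dumps(parse_document(example), indent=2))
```

All values stay strings here; converting codes to numbers, or lower-casing the key names as in the expected output, would be an extra step on top of this.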