skip to Main Content

Hi I have json string contain 3 nested json arrays and i need to split those 3 arrays into json records in python

I try below code to :

the three nested array is :
questionGroups‘ which contain another nested json array ‘questions‘ which contain another nested json array ‘answerOptions

import json
import pandas as pd

json_data = """
{
    "id": "1", "name": "test1", "modifiedDate": "2023-10-17T16:20:09.091Z",
    "questionGroups": 
    [
        {"id": "12",
         "name": "Welcome", 
         "type": "questionGroup", 
         "manualWeight": false, 
        "questions": 
        [
            {"id": "123", "text": "Respond with welcome", "helpText": "", "type": "multipleChoiceQuestion", "naEnabled": true, 
            "answerOptions": 
                    [{"id": "12345", "text": "0", "value": 0}, {"id": "847y59", "text": "1", "value": 2}],
                    "isKill": false, "isCritical": false},
            {"id": "234", "text": "Respond with hello", "helpText": "", "type": "ChoiceQuestion", "naEnabled": true,
            "answerOptions": [{"id": "12345", "text": "0", "value": 0}, {"id": "sdkmfkj", "text": "1", "value": 2}], 
            "isKill": false, "isCritical": false}
        ]
        },
        {"id": "13",
        "name": "hi", 
        "type": "questionGroup", 
        "manualWeight": false, 
       "questions": 
       [
           {"id": "33123", "text": "Respond with welcome", "helpText": "", "type": "multipleChoiceQuestion", "naEnabled": true, 
           "answerOptions": 
                   [{"id": "12345", "text": "0", "value": 0}, {"id": "847y59", "text": "1", "value": 2}],
                   "isKill": false, "isCritical": false},
           {"id": "67234", "text": "Respond with hello", "helpText": "", "type": "ChoiceQuestion", "naEnabled": true,
           "answerOptions": [{"id": "12345", "text": "0", "value": 0}, {"id": "sdkmfkj", "text": "1", "value": 2}], 
           "isKill": false, "isCritical": false}
       ]
       }
    ]
}
"""


data = json.loads(json_data)

print(pd.json_normalize(data, record_path=['questionGroups'],meta=[
    'id', 'name','modifiedDate'],record_prefix='questionGroups_').explode('questionGroups_questions').to_json(orient='records', lines=True))

the output is only 4 records ( it only split 2 arrays (‘questionGroups’ and ‘questions’) but ‘answerOptions’ did not split into reocrds ):

{"questionGroups_id":"12","questionGroups_name":"Welcome","questionGroups_type":"questionGroup","questionGroups_manualWeight":false,"questionGroups_questions":{"id":"123","text":"Respond with welcome","helpText":"","type":"multipleChoiceQuestion","naEnabled":true,"answerOptions":[{"id":"12345","text":"0","value":0},{"id":"847y59","text":"1","value":2}],"isKill":false,"isCritical":false},"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z"}
{"questionGroups_id":"12","questionGroups_name":"Welcome","questionGroups_type":"questionGroup","questionGroups_manualWeight":false,"questionGroups_questions":{"id":"234","text":"Respond with hello","helpText":"","type":"ChoiceQuestion","naEnabled":true,"answerOptions":[{"id":"12345","text":"0","value":0},{"id":"sdkmfkj","text":"1","value":2}],"isKill":false,"isCritical":false},"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z"}
{"questionGroups_id":"13","questionGroups_name":"hi","questionGroups_type":"questionGroup","questionGroups_manualWeight":false,"questionGroups_questions":{"id":"33123","text":"Respond with welcome","helpText":"","type":"multipleChoiceQuestion","naEnabled":true,"answerOptions":[{"id":"12345","text":"0","value":0},{"id":"847y59","text":"1","value":2}],"isKill":false,"isCritical":false},"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z"}
{"questionGroups_id":"13","questionGroups_name":"hi","questionGroups_type":"questionGroup","questionGroups_manualWeight":false,"questionGroups_questions":{"id":"67234","text":"Respond with hello","helpText":"","type":"ChoiceQuestion","naEnabled":true,"answerOptions":[{"id":"12345","text":"0","value":0},{"id":"sdkmfkj","text":"1","value":2}],"isKill":false,"isCritical":false},"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z"}

i want to convert the 3 nested list to raws in json format but it only split the 2 arrays (‘questionGroups’ and ‘questions’) and for ‘answerOptions’ did not split into records

so what i excepted is the output to be 8 records but the output is only 4 records because the ‘answerOptions’ did not split into records

is there are a way to split also the inner array ‘answerOptions‘ ? or only i can split the top tow nested json array ?

the excepted output is 8 records like below :

    {"questionGroups_questions_id":"123","questionGroups_questions_text":"Respond with welcome","questionGroups_questions_helpText":"","questionGroups_questions_type":"multipleChoiceQuestion","questionGroups_questions_naEnabled":true,"questionGroups_questions_answerOptions":{"id":"12345","text":"0","value":0},"questionGroups_questions_isKill":false,"questionGroups_questions_isCritical":false,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"12","questionGroups.name":"Welcome","questionGroups.type":"questionGroup","questionGroups_manualWeight":false}
{"questionGroups_questions_id":"123","questionGroups_questions_text":"Respond with welcome","questionGroups_questions_helpText":"","questionGroups_questions_type":"multipleChoiceQuestion","questionGroups_questions_naEnabled":true,"questionGroups_questions_answerOptions":{"id":"847y59","text":"1","value":2},"questionGroups_questions_isKill":false,"questionGroups_questions_isCritical":false,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"12","questionGroups.name":"Welcome","questionGroups.type":"questionGroup","questionGroups_manualWeight":false}
{"questionGroups_questions_id":"234","questionGroups_questions_text":"Respond with hello","questionGroups_questions_helpText":"","questionGroups_questions_type":"ChoiceQuestion","questionGroups_questions_naEnabled":true,"questionGroups_questions_answerOptions":{"id":"12345","text":"0","value":0},"questionGroups_questions_isKill":false,"questionGroups_questions_isCritical":false,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"12","questionGroups.name":"Welcome","questionGroups.type":"questionGroup","questionGroups_manualWeight":false}
{"questionGroups_questions_id":"234","questionGroups_questions_text":"Respond with hello","questionGroups_questions_helpText":"","questionGroups_questions_type":"ChoiceQuestion","questionGroups_questions_naEnabled":true,"questionGroups_questions_answerOptions":{"id":"sdkmfkj","text":"1","value":2},"questionGroups_questions_isKill":false,"questionGroups_questions_isCritical":false,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"12","questionGroups.name":"Welcome","questionGroups.type":"questionGroup","questionGroups_manualWeight":false}
{"questionGroups_questions_id":"33123","questionGroups_questions_text":"Respond with welcome","questionGroups_questions_helpText":"","questionGroups_questions_type":"multipleChoiceQuestion","questionGroups_questions_naEnabled":true,"questionGroups_questions_answerOptions":{"id":"12345","text":"0","value":0},"questionGroups_questions_isKill":false,"questionGroups_questions_isCritical":false,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"13","questionGroups.name":"hi","questionGroups.type":"questionGroup","questionGroups_manualWeight":false}
{"questionGroups_questions_id":"33123","questionGroups_questions_text":"Respond with welcome","questionGroups_questions_helpText":"","questionGroups_questions_type":"multipleChoiceQuestion","questionGroups_questions_naEnabled":true,"questionGroups_questions_answerOptions":{"id":"847y59","text":"1","value":2},"questionGroups_questions_isKill":false,"questionGroups_questions_isCritical":false,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"13","questionGroups.name":"hi","questionGroups.type":"questionGroup","questionGroups_manualWeight":false}
{"questionGroups_questions_id":"67234","questionGroups_questions_text":"Respond with hello","questionGroups_questions_helpText":"","questionGroups_questions_type":"ChoiceQuestion","questionGroups_questions_naEnabled":true,"questionGroups_questions_answerOptions":{"id":"12345","text":"0","value":0},"questionGroups_questions_isKill":false,"questionGroups_questions_isCritical":false,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"13","questionGroups.name":"hi","questionGroups.type":"questionGroup","questionGroups_manualWeight":false}
{"questionGroups_questions_id":"67234","questionGroups_questions_text":"Respond with hello","questionGroups_questions_helpText":"","questionGroups_questions_type":"ChoiceQuestion","questionGroups_questions_naEnabled":true,"questionGroups_questions_answerOptions":{"id":"sdkmfkj","text":"1","value":2},"questionGroups_questions_isKill":false,"questionGroups_questions_isCritical":false,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"13","questionGroups.name":"hi","questionGroups.type":"questionGroup","questionGroups_manualWeight":false}

2

Answers


  1. IIUC:

    You can add all paths to record_path and net the meta path:

    Code:

    print(pd.json_normalize(
        data=data,
        record_path=[
            "questionGroups",
            "questions",
            "answerOptions"
        ],
        meta=[
            "id", "name", "modifiedDate",
            ["questionGroups", "id"], ["questionGroups", "name"], ["questionGroups", "type"],
            ["questionGroups", "manualWeight"],
            ["questionGroups", "questions", "isKill"], ["questionGroups", "questions", "isCritical"],
            ["questionGroups", "questions", "id"], ["questionGroups", "questions", "text"],
            ["questionGroups", "questions", "helpText"],
        ],
        record_prefix="meta."
    ).to_json(orient="records", lines=True))
    

    Output:

    {"meta.id":"12345","meta.text":"0","meta.value":0,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"12","questionGroups.name":"Welcome","questionGroups.type":"questionGroup","questionGroups.manualWeight":false,"questionGroups.questions.isKill":false,"questionGroups.questions.isCritical":false,"questionGroups.questions.id":"123","questionGroups.questions.text":"Respond with welcome","questionGroups.questions.helpText":""}
    {"meta.id":"847y59","meta.text":"1","meta.value":2,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"12","questionGroups.name":"Welcome","questionGroups.type":"questionGroup","questionGroups.manualWeight":false,"questionGroups.questions.isKill":false,"questionGroups.questions.isCritical":false,"questionGroups.questions.id":"123","questionGroups.questions.text":"Respond with welcome","questionGroups.questions.helpText":""}
    {"meta.id":"12345","meta.text":"0","meta.value":0,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"12","questionGroups.name":"Welcome","questionGroups.type":"questionGroup","questionGroups.manualWeight":false,"questionGroups.questions.isKill":false,"questionGroups.questions.isCritical":false,"questionGroups.questions.id":"234","questionGroups.questions.text":"Respond with hello","questionGroups.questions.helpText":""}
    {"meta.id":"sdkmfkj","meta.text":"1","meta.value":2,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"12","questionGroups.name":"Welcome","questionGroups.type":"questionGroup","questionGroups.manualWeight":false,"questionGroups.questions.isKill":false,"questionGroups.questions.isCritical":false,"questionGroups.questions.id":"234","questionGroups.questions.text":"Respond with hello","questionGroups.questions.helpText":""}
    {"meta.id":"12345","meta.text":"0","meta.value":0,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"13","questionGroups.name":"hi","questionGroups.type":"questionGroup","questionGroups.manualWeight":false,"questionGroups.questions.isKill":false,"questionGroups.questions.isCritical":false,"questionGroups.questions.id":"33123","questionGroups.questions.text":"Respond with welcome","questionGroups.questions.helpText":""}
    {"meta.id":"847y59","meta.text":"1","meta.value":2,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"13","questionGroups.name":"hi","questionGroups.type":"questionGroup","questionGroups.manualWeight":false,"questionGroups.questions.isKill":false,"questionGroups.questions.isCritical":false,"questionGroups.questions.id":"33123","questionGroups.questions.text":"Respond with welcome","questionGroups.questions.helpText":""}
    {"meta.id":"12345","meta.text":"0","meta.value":0,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"13","questionGroups.name":"hi","questionGroups.type":"questionGroup","questionGroups.manualWeight":false,"questionGroups.questions.isKill":false,"questionGroups.questions.isCritical":false,"questionGroups.questions.id":"67234","questionGroups.questions.text":"Respond with hello","questionGroups.questions.helpText":""}
    {"meta.id":"sdkmfkj","meta.text":"1","meta.value":2,"id":"1","name":"test1","modifiedDate":"2023-10-17T16:20:09.091Z","questionGroups.id":"13","questionGroups.name":"hi","questionGroups.type":"questionGroup","questionGroups.manualWeight":false,"questionGroups.questions.isKill":false,"questionGroups.questions.isCritical":false,"questionGroups.questions.id":"67234","questionGroups.questions.text":"Respond with hello","questionGroups.questions.helpText":""}
    
    Login or Signup to reply.
  2. I would just manually walk your json loaded as a dictionary via json.load() and assemble each row:

    results = [
        {
            "questionGroups_questions_id": question["id"],
            "questionGroups_questions_text": question["text"],
            "questionGroups_questions_helpText": question["helpText"],
            "questionGroups_questions_type": question["type"],
            "questionGroups_questions_naEnabled": question["naEnabled"],
            "questionGroups_questions_answerOptions": answer_option,
            "questionGroups_questions_isKill": question["isKill"],
            "questionGroups_questions_isCritical": question["isKill"],
            "id": json_data["id"],
            "name": json_data["name"],
            "modifiedDate": json_data["modifiedDate"],
            "questionGroups.id": question_group["id"],
            "questionGroups.name": question_group["name"],
            "questionGroups.type": question_group["type"],
            "questionGroups_manualWeight": question_group["manualWeight"]
        }
        for question_group in json_data["questionGroups"]
        for question in question_group["questions"]
        for answer_option in question["answerOptions"]
    ]
    
    print(len(results))
    
    import json
    for result in results:
        print(json.dumps(result))
    

    That should give you:

    8
    {"questionGroups_questions_id": "123", "questionGroups_questions_text": "Respond with welcome", "questionGroups_questions_helpText": "", "questionGroups_questions_type": "multipleChoiceQuestion", "questionGroups_questions_naEnabled": true, "questionGroups_questions_answerOptions": {"id": "12345", "text": "0", "value": 0}, "questionGroups_questions_isKill": false, "questionGroups_questions_isCritical": false, "id": "1", "name": "test1", "modifiedDate": "2023-10-17T16:20:09.091Z", "questionGroups.id": "12", "questionGroups.name": "Welcome", "questionGroups.type": "questionGroup", "questionGroups_manualWeight": false}
    {"questionGroups_questions_id": "123", "questionGroups_questions_text": "Respond with welcome", "questionGroups_questions_helpText": "", "questionGroups_questions_type": "multipleChoiceQuestion", "questionGroups_questions_naEnabled": true, "questionGroups_questions_answerOptions": {"id": "847y59", "text": "1", "value": 2}, "questionGroups_questions_isKill": false, "questionGroups_questions_isCritical": false, "id": "1", "name": "test1", "modifiedDate": "2023-10-17T16:20:09.091Z", "questionGroups.id": "12", "questionGroups.name": "Welcome", "questionGroups.type": "questionGroup", "questionGroups_manualWeight": false}
    {"questionGroups_questions_id": "234", "questionGroups_questions_text": "Respond with hello", "questionGroups_questions_helpText": "", "questionGroups_questions_type": "ChoiceQuestion", "questionGroups_questions_naEnabled": true, "questionGroups_questions_answerOptions": {"id": "12345", "text": "0", "value": 0}, "questionGroups_questions_isKill": false, "questionGroups_questions_isCritical": false, "id": "1", "name": "test1", "modifiedDate": "2023-10-17T16:20:09.091Z", "questionGroups.id": "12", "questionGroups.name": "Welcome", "questionGroups.type": "questionGroup", "questionGroups_manualWeight": false}        
    {"questionGroups_questions_id": "234", "questionGroups_questions_text": "Respond with hello", "questionGroups_questions_helpText": "", "questionGroups_questions_type": "ChoiceQuestion", "questionGroups_questions_naEnabled": true, "questionGroups_questions_answerOptions": {"id": "sdkmfkj", "text": "1", "value": 2}, "questionGroups_questions_isKill": false, "questionGroups_questions_isCritical": false, "id": "1", "name": "test1", "modifiedDate": "2023-10-17T16:20:09.091Z", "questionGroups.id": "12", "questionGroups.name": "Welcome", "questionGroups.type": "questionGroup", "questionGroups_manualWeight": false}      
    {"questionGroups_questions_id": "33123", "questionGroups_questions_text": "Respond with welcome", "questionGroups_questions_helpText": "", "questionGroups_questions_type": "multipleChoiceQuestion", "questionGroups_questions_naEnabled": true, "questionGroups_questions_answerOptions": {"id": "12345", "text": "0", "value": 0}, "questionGroups_questions_isKill": false, "questionGroups_questions_isCritical": false, "id": "1", "name": "test1", "modifiedDate": "2023-10-17T16:20:09.091Z", "questionGroups.id": "13", "questionGroups.name": "hi", "questionGroups.type": "questionGroup", "questionGroups_manualWeight": false} 
    {"questionGroups_questions_id": "33123", "questionGroups_questions_text": "Respond with welcome", "questionGroups_questions_helpText": "", "questionGroups_questions_type": "multipleChoiceQuestion", "questionGroups_questions_naEnabled": true, "questionGroups_questions_answerOptions": {"id": "847y59", "text": "1", "value": 2}, "questionGroups_questions_isKill": false, "questionGroups_questions_isCritical": false, "id": "1", "name": "test1", "modifiedDate": "2023-10-17T16:20:09.091Z", "questionGroups.id": "13", "questionGroups.name": "hi", "questionGroups.type": "questionGroup", "questionGroups_manualWeight": false}
    {"questionGroups_questions_id": "67234", "questionGroups_questions_text": "Respond with hello", "questionGroups_questions_helpText": "", "questionGroups_questions_type": "ChoiceQuestion", "questionGroups_questions_naEnabled": true, "questionGroups_questions_answerOptions": {"id": "12345", "text": "0", "value": 0}, "questionGroups_questions_isKill": false, "questionGroups_questions_isCritical": false, "id": "1", "name": "test1", "modifiedDate": "2023-10-17T16:20:09.091Z", "questionGroups.id": "13", "questionGroups.name": "hi", "questionGroups.type": "questionGroup", "questionGroups_manualWeight": false}
    {"questionGroups_questions_id": "67234", "questionGroups_questions_text": "Respond with hello", "questionGroups_questions_helpText": "", "questionGroups_questions_type": "ChoiceQuestion", "questionGroups_questions_naEnabled": true, "questionGroups_questions_answerOptions": {"id": "sdkmfkj", "text": "1", "value": 2}, "questionGroups_questions_isKill": false, "questionGroups_questions_isCritical": false, "id": "1", "name": "test1", "modifiedDate": "2023-10-17T16:20:09.091Z", "questionGroups.id": "13", "questionGroups.name": "hi", "questionGroups.type": "questionGroup", "questionGroups_manualWeight": false}
    

    You can write this out with json.dump()

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search