I have a JSON file that contains duplicate records, identified by a field called userid. I want to remove those duplicates using the append() function and write the result to a new file in the same format as the original.

Here is the JSON file:

    [
        {
            "userid": "7126521576",
            "status": "UserStatus.OFFLINE",
            "name": "Avril Pauling",
            "bot": false,
            "username": "None"
        },
        {
            "userid": "7126521576",
            "status": "UserStatus.OFFLINE",
            "name": "Avril Pauling",
            "bot": false,
            "username": "None"
        },
        {
            "userid": "6571627119",
            "status": "UserStatus.OFFLINE",
            "name": "Laverne Alferez",
            "bot": false,
            "username": "None"
        },
        {
            "userid": "1995422560",
            "status": "UserStatus.OFFLINE",
            "name": "098767800",
            "bot": false,
            "username": "None"
        }
    ]

After removing the duplicate userids, the output file should be:

    [
        {
            "userid": "7126521576",
            "status": "UserStatus.OFFLINE",
            "name": "Avril Pauling",
            "bot": false,
            "username": "None"
        },
        {
            "userid": "6571627119",
            "status": "UserStatus.OFFLINE",
            "name": "Laverne Alferez",
            "bot": false,
            "username": "None"
        },
        {
            "userid": "1995422560",
            "status": "UserStatus.OFFLINE",
            "name": "098767800",
            "bot": false,
            "username": "None"
        }
    ]

I have tried the following code, but append() does not appear to work correctly; it only appends the last item:

    import json
    with open('target_user.json', 'r', encoding='utf-8') as f:
        jsons = json.load(f)

    jsons2 = []
    for item in jsons:
        if item['userid'] not in json2:
            jsons2.append(item)
            
    with open('target_user2.json', 'w', encoding='utf-8') as nf:
        json.dump(jsons2, nf, indent=4)

Any quick help would be much appreciated.

2 Answers


  1. This should do what you need:

    import json
    with open('target_user.json', 'r', encoding='utf-8') as f:
        jsons = json.load(f)
    
    ids = set()
    jsons2 = []
    for item in jsons:
        if item['userid'] not in ids:
            ids.add(item['userid'])
            jsons2.append(item)
            
    with open('target_user2.json', 'w', encoding='utf-8') as nf:
        json.dump(jsons2, nf, indent=4)
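
The same idea can be wrapped in a small helper. This is only a sketch: dedupe_by_key and the trimmed-down sample records are made up for illustration, and it assumes Python 3.7+, where dicts keep insertion order.

```python
import json


def dedupe_by_key(records, key="userid"):
    """Keep the first record seen for each value of `key`, preserving order."""
    first_seen = {}
    for record in records:
        # setdefault stores the record only if this key value is new
        first_seen.setdefault(record[key], record)
    return list(first_seen.values())


# Hypothetical in-memory stand-in for the contents of target_user.json
sample = json.loads("""
[
    {"userid": "7126521576", "name": "Avril Pauling"},
    {"userid": "7126521576", "name": "Avril Pauling"},
    {"userid": "6571627119", "name": "Laverne Alferez"},
    {"userid": "1995422560", "name": "098767800"}
]
""")

unique = dedupe_by_key(sample)
print(json.dumps(unique, indent=4))
```

To use it on the real file, pass the list returned by json.load(f) and write the result with json.dump as above.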
    
  2. You have a typo here: you wrote json2 where you meant jsons2.

    import json
    
    with open('target_user.json', 'r', encoding='utf-8') as f:
        jsons = json.load(f)
    
    jsons2 = []
    for item in jsons:
        if item['userid'] not in json2:
            jsons2.append(item)
            
    with open('target_user2.json', 'w', encoding='utf-8') as nf:
        json.dump(jsons2, nf, indent=4)
    

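    You define jsons2, but the check reads if item['userid'] not in json2:, so json2 should be jsons2 there.

    Try again with the corrected name. Note that fixing it only removes the NameError: the check still compares a userid string against a list of dicts, so it is always true and the duplicates remain. Compare against a collection of the userids already seen instead.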
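
A small sketch of how the membership test behaves once the name is fixed. The records list is a trimmed-down, made-up version of the file's contents, and by_string and by_record are illustrative names only.

```python
records = [
    {"userid": "7126521576", "name": "Avril Pauling"},
    {"userid": "7126521576", "name": "Avril Pauling"},
    {"userid": "6571627119", "name": "Laverne Alferez"},
]

# A userid string is never equal to a dict, so this test is always true:
# every record is appended and the duplicate survives.
by_string = []
for item in records:
    if item["userid"] not in by_string:
        by_string.append(item)

# Comparing the whole dict against the list does match identical records.
by_record = []
for item in records:
    if item not in by_record:
        by_record.append(item)

print(len(by_string), len(by_record))
```

Comparing whole records only drops entries that are identical in every field and rescans the list on each step, so tracking the userids in a set is usually the better fit for this file.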

    I don't have enough reputation to comment, so I posted this as an answer.
