
I have a JSON file that looks like this:

{"k2": 39, "k1": 52}
{"k2": 39, "k1": 52}
{"k3": 66, "k2": 38}
{"k2": 35}
{}
{}
{"k1": 52, "k2": 39}

I need to remove all duplicate dicts from this JSON.

I tried a set comprehension over tuples, but it doesn't work. I get this error:

'str' object has no attribute 'items'

code.py

import json

a = open('aboba.json', 'r')
data = a.read()
get_json = json.loads(json.dumps(data))

delete_all_dublicates = [dict(t) for t in {tuple(d.items()) for d in get_json}]
print(delete_all_dublicates)

2 Answers


  1. If I understood you correctly, you can remove the duplicates with the following steps:

    1. Split JSON data into individual JSON strings.
    2. Convert each JSON string to a Python dictionary and sort its keys, so that dicts with the same items in a different order compare equal
    3. Use a set to keep track of unique dictionaries
    4. Remove duplicates by adding sorted dictionaries to the set
    5. Convert the tuples back to dictionaries
    6. Convert the result list back to a JSON string

    import json
    
    json_data = '''
        {"k2": 39, "k1": 52}
        {"k2": 39, "k1": 52}
        {"k3": 66, "k2": 38}
        {"k2": 35}
        {}
        {}
        {"k1": 52, "k2": 39}
    '''
    
    # Split on newlines and drop blank lines
    json_list = [json_str.strip() for json_str in json_data.split('\n') if json_str.strip()]
    
    dict_list = [json.loads(json_str) for json_str in json_list]
    sorted_dicts = [dict(sorted(d.items())) for d in dict_list]
    
    # A set of item tuples drops duplicates; the keys were sorted above, so key order no longer matters
    unique_dicts = {tuple(d.items()) for d in sorted_dicts}
    
    result = [dict(d) for d in unique_dicts]
    
    result_json = '\n'.join(json.dumps(d) for d in result)
    
    print(result_json)
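
As for the error in the question: `json.dumps(data)` wraps the file's text in a JSON string and `json.loads` unwraps it again, so `get_json` is still a `str`, and iterating over it yields single characters that have no `.items()`. Also note that a set does not keep the original order of the lines. Below is a minimal sketch of an order-preserving variant, assuming the input holds one JSON object per line. The helper name `dedupe_json_lines` is made up for illustration, and `sample` stands in for the contents of the file.

```python
import json

def dedupe_json_lines(lines):
    """Return the unique dicts from an iterable of JSON strings, in first-seen order."""
    seen = set()
    unique = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        d = json.loads(line)
        # Canonical text form: sort_keys makes key order irrelevant,
        # and it also works when values are lists or nested dicts.
        key = json.dumps(d, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique

# Stand-in for the file contents from the question
sample = '''
{"k2": 39, "k1": 52}
{"k2": 39, "k1": 52}
{"k3": 66, "k2": 38}
{"k2": 35}
{}
{}
{"k1": 52, "k2": 39}
'''

result = dedupe_json_lines(sample.splitlines())
print("\n".join(json.dumps(d) for d in result))
```

To read from a file instead, an open file object can be passed directly, since iterating over it yields lines: `with open('aboba.json') as f: result = dedupe_json_lines(f)`.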
    
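  2. This is my answer using the ** operator. Note that it merges all the dicts into a single dict (later values overwrite earlier ones) rather than removing duplicate dicts: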

    json_data = '''{"k2": 39, "k1": 52}
    {"k2": 39, "k1": 52}
    {"k3": 66, "k2": 38}
    {"k2": 35}
    {}
    {}
    {"k1": 52, "k2": 39}'''
    
    import json
    dict_lst = [json.loads(d) for d in json_data.split("\n")]
    
    d_all = dict()
    
    for d in dict_lst:
        # ** turns the dictionary into keyword parameters:
        d_all = dict(d_all, **d)
        
    d_all
    # {'k2': 39, 'k1': 52, 'k3': 66}
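
To make the merge behaviour of `**` easier to see, here is a small sketch with two of the dicts from the question. The variable names are only illustrative. `dict(a, **b)` passes the items of `b` as keyword arguments, so it only works when every key of `b` is a string; the `{**a, **b}` form has no such restriction.

```python
a = {"k2": 39, "k1": 52}
b = {"k3": 66, "k2": 38}

# Both forms copy a, then apply b on top: b's value wins for the shared key "k2"
merged_kwargs = dict(a, **b)
merged_unpack = {**a, **b}
print(merged_kwargs)  # {'k2': 38, 'k1': 52, 'k3': 66}

# Keyword arguments must be strings, so a non-string key raises TypeError
try:
    dict(a, **{1: "x"})
    kwargs_failed = False
except TypeError:
    kwargs_failed = True

# Dict unpacking inside a literal accepts any hashable key
mixed = {**a, **{1: "x"}}
```

Either way the result is one combined dict, which is a different operation from dropping repeated lines.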
    