
I have a large nested dict, nested_dict, that was created using parallel processing, so there is a DictProxy object at each level. Re-creating this dict takes hours, so I want to save everything to a JSON file. As per How to convert a DictProxy object into JSON serializable dict?, a DictProxy object can be converted to a dict and then serialized to JSON. But because my DictProxy objects are nested, running json.dumps(nested_dict.copy()) raises TypeError: Object of type DictProxy is not JSON serializable.

Is there an efficient way to recursively convert all DictProxy objects to dict to allow saving in a JSON file?

2 Answers


  1. Chosen as BEST ANSWER

    Simply creating a new empty dict and populating it with for loops over the keys, down to the innermost dict, solved it:

    new_dict = {}
    for first_key in nested_dict.keys():
        new_dict[first_key] = {}
        for second_key in nested_dict[first_key].keys():
            new_dict[first_key][second_key] = {}
            ...
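            # The innermost values are DictProxy objects, so copy each one into a plain dict.
            for last_key in nested_dict[first_key][second_key][...].keys():
                new_dict[first_key][second_key][...][last_key] = nested_dict[first_key][second_key][...][last_key].copy()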
    

    And then

    import json
    with open("my.json","w") as f:
        json.dump(new_dict,f)
    

    Maybe it isn't the most efficient approach, but it works!
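Since the goal is to avoid re-running the hours-long creation step, it may be worth checking that the file loads back as expected. Here is a minimal sketch of the round trip. The contents of new_dict and the temporary path are made up for illustration; only json.dump and json.load come from the standard library as used above.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the already-converted new_dict from above.
new_dict = {"run_1": {1: 0.5, 2: 0.75}}

# Write to a temporary location so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "my.json")
with open(path, "w") as f:
    json.dump(new_dict, f)

# Load the file back instead of rebuilding the dict.
with open(path) as f:
    restored = json.load(f)

# JSON object keys are always strings, so the int keys come back as "1" and "2".
print(restored)
```

One thing to watch for: if the original dict uses non-string keys, the restored dict will not compare equal to it, because JSON stores every object key as a string.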


  2. How about some dict comprehension and a little recursion here:

    from multiprocessing import Manager
    from multiprocessing.managers import DictProxy
    
    
    def get_value(d):
        # Recursively replace every nested DictProxy with a plain dict.
        return {
            key: get_value(sub_d) if isinstance(sub_d, DictProxy) else sub_d
            for key, sub_d in d.items()
        }
    
    
    if __name__ == "__main__":
    
        with Manager() as manager:
    
            d1, d2, d3 = manager.dict(), manager.dict(), manager.dict()
    
            d3['d'] = 'end of nested levels'
            d2['d3'] = d3
            d1['d2'] = d2
    
            print(d1)
            print(get_value(d1))
    

    Output

    {'d2': <DictProxy object, typeid 'dict' at 0x236493f1f70>}
    {'d2': {'d3': {'d': 'end of nested levels'}}}
    

    As a bonus, this also works if the dictionary contains no DictProxy objects or isn't nested at all.
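If the nested structure also contains manager.list() objects, the same idea can probably be extended to cover them and then written straight to JSON. This is only a sketch under that assumption: to_plain and the sample data are hypothetical names, while DictProxy and ListProxy are the proxy classes from multiprocessing.managers. The conversion has to happen while the manager is still running, since the proxies need the manager process to read their contents.

```python
import json
from multiprocessing import Manager
from multiprocessing.managers import DictProxy, ListProxy


def to_plain(obj):
    """Recursively copy proxies (and plain containers) into plain dicts and lists."""
    if isinstance(obj, (DictProxy, dict)):
        return {key: to_plain(value) for key, value in obj.items()}
    if isinstance(obj, (ListProxy, list)):
        return [to_plain(item) for item in obj]
    # Anything else (str, int, float, None, ...) is returned unchanged.
    return obj


if __name__ == "__main__":
    with Manager() as manager:
        # Hypothetical sample data: a list proxy nested inside two dict proxies.
        inner = manager.dict({"values": manager.list([1, 2, 3])})
        outer = manager.dict({"inner": inner})
        plain = to_plain(outer)  # convert before the manager shuts down

    with open("my.json", "w") as f:
        json.dump(plain, f)
```

Checking for plain dict and list as well means the function also descends into ordinary containers that happen to hold proxies further down.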
