
I want to merge two lists, match their data without duplicates, and map them to a new structure

I have two lists that I am trying to merge. The first list:

cats = [
    'orange', 'apple', 'banana'
]

and the second list:

types = [
    {
        "id": 1,
        "type": "orange"
    },
    {
        "id": 2,
        "type": "apple"
    },
    {
        "id": 3,
        "type": "apple"
    },
    {
        "id": 4,
        "type": "orange"
    },
    {
        "id": 5,
        "type": "banana"
    }
]

and I want to combine them to get this result:

[
    {'orange': {
        'UNIT': [1, 4]
    }
    },
    {'apple': {
        'UNIT': [2, 3]
    }
    },
    {'banana': {
        'UNIT': [5]
    }
    }
]

Here is my code after several attempts:

matched = []
for item in types:
    for cat in cats:
        if item['type'] == cat:
            matched.append(
                {
                    cat: {
                        "UNIT": [i['id'] for i in types if
                                 'id' in i]
                    }
                }
            )

and this is the result I get:


[{'orange': {'UNIT': [1, 2, 3, 4, 5]}},
 {'apple': {'UNIT': [1, 2, 3, 4, 5]}},
 {'apple': {'UNIT': [1, 2, 3, 4, 5]}},
 {'orange': {'UNIT': [1, 2, 3, 4, 5]}},
 {'banana': {'UNIT': [1, 2, 3, 4, 5]}}]

3 Answers


  1. With list comprehension:

    [{cat:{'UNIT':[type_['id'] for type_ in types if type_['type'] == cat]}} for cat in cats]
    
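  2. Your problem is the condition inside your list comprehension: the filter if 'id' in i only checks that the key exists, so every id is collected regardless of its type. Beyond that, your code is more complex than it needs to be. You get repeated entries because the nested for loops never check whether that fruit was already added to matched.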
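
    As a minimal sketch of the smallest change to the original loop (the seen set is an addition for illustration, not part of the question's code), the comprehension can compare the type, and a set can skip fruits that were already added:

```python
cats = ['orange', 'apple', 'banana']
types = [
    {"id": 1, "type": "orange"},
    {"id": 2, "type": "apple"},
    {"id": 3, "type": "apple"},
    {"id": 4, "type": "orange"},
    {"id": 5, "type": "banana"},
]

matched = []
seen = set()  # illustrative helper: fruits that already have an entry in matched

for item in types:
    cat = item['type']
    if cat not in cats or cat in seen:
        continue  # unknown fruit, or this fruit was already handled
    seen.add(cat)
    matched.append({
        cat: {
            # keep only the ids whose type equals this fruit
            "UNIT": [i['id'] for i in types if i['type'] == cat]
        }
    })

print(matched)
# [{'orange': {'UNIT': [1, 4]}}, {'apple': {'UNIT': [2, 3]}}, {'banana': {'UNIT': [5]}}]
```

    This still rescans types once per fruit, so it is meant to show where the bug was rather than to be the most efficient form.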

    To reduce the two lists to the needed values, you can use:

    cats = ['orange', 'apple', 'banana']

    types = [{"id": 1, "type": "orange"},
             {"id": 2, "type": "apple"},
             {"id": 3, "type": "apple"},
             {"id": 4, "type": "orange"},
             {"id": 5, "type": "banana"}]
    
    rv = {}
    for inner in types:
        t = inner["type"]
        if t not in cats: # not needed for the current data; only needed
            continue      # if some types do not occur in cats
        rv.setdefault(t, [])
        rv[t].append(inner["id"])
            
    print(rv)
    

    which produces a simpler dictionary with all the data you need:

    {'orange': [1, 4], 'apple': [2, 3], 'banana': [5]}
    

    From there you can build up your overly complex list of dicts with 1 key each:

    
    lv = [{k: {"UNIT": v}} for k, v in rv.items()]
    print(lv)
    

    to get

    [{'orange': {'UNIT': [1, 4]}}, 
     {'apple': {'UNIT': [2, 3]}}, 
     {'banana': {'UNIT': [5]}}]
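
    One detail worth noting: rv.items() yields the fruits in the order they first appear in types (dicts keep insertion order in Python 3.7+), which happens to match cats for this data. If the output should follow the order of cats instead, a sketch along these lines may help. The function name group_ids and the extra "kiwi" entry are made up for illustration:

```python
def group_ids(cats, types):
    """Hypothetical helper: group ids by type, emitted in the order of cats."""
    wanted = set(cats)  # set lookup avoids scanning the cats list each time
    rv = {}
    for inner in types:
        t = inner["type"]
        if t in wanted:
            rv.setdefault(t, []).append(inner["id"])
    # iterate over cats (not rv) so the result follows the order of cats
    return [{k: {"UNIT": rv[k]}} for k in cats if k in rv]


cats = ['orange', 'apple', 'banana']
types = [
    {"id": 5, "type": "banana"},
    {"id": 2, "type": "apple"},
    {"id": 1, "type": "orange"},
    {"id": 9, "type": "kiwi"},  # not in cats, so it is skipped
]

result = group_ids(cats, types)
print(result)
# [{'orange': {'UNIT': [1]}}, {'apple': {'UNIT': [2]}}, {'banana': {'UNIT': [5]}}]
```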
    

    Answer to the extended problem from the comments:
    If you need to capture more fields, you can take advantage of the fact that lists and dicts are stored by reference, so changing a dict you looked up also changes it inside the list:

    cats = ['orange', 'apple', 'banana']
    types = [{"id": 1, "type": "orange", "bouncyness": 42},
             {"id": 2, "type": "apple",  "bouncyness": 21},
             {"id": 3, "type": "apple",  "bouncyness": 63},
             {"id": 4, "type": "orange", "bouncyness": 84},
             {"id": 5, "type": "banana", "bouncyness": 99}]
    
    rv = []   # list of dicts - single key is "fruitname"
    pil = {}  # dict that keeps track which fruit is on what position in rv
              # to avoid iterating over list to find correct fruit-dict
    
    di = None # the current fruits dictionary 
    
    for inner in types:
        t = inner["type"]
        if t not in cats: # not needed for the current data; only needed
            continue      # if some types do not occur in cats
    
        # step1: find correct fruit dict in rv or create new one and add it
        di = None
        if t in pil:
            # get cached dict by position of fruit in rv
            di = rv[pil[t]]
        else:
            # create fruit dict, cache position in rv in pil
            di = {}
            rv.append(di)
            pil[t] = len(rv)-1
    
        # step2: create all the needed inner lists         
        #   you can speed this up using defaultdict(list) if speed gets
        #   problematic - until then dict.setdefault should be fine
        di.setdefault(t, [])
        di.setdefault("bouncyness", [])
    
        # step3: fill with values
        di[t].append(inner["id"])
        di["bouncyness"].append(inner["bouncyness"])
            
    print(rv)
    

    to get

    [{'orange': [1, 4], 'bouncyness': [42, 84]}, 
     {'apple': [2, 3], 'bouncyness': [21, 63]}, 
     {'banana': [5], 'bouncyness': [99]}]
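
    The comment in step 2 mentions defaultdict(list) as a possible speed-up. A rough sketch of that variant follows. Note that it swaps the pil position cache for a dict keyed by fruit name, which is a different bookkeeping choice than the code above, and the name by_fruit is only illustrative:

```python
from collections import defaultdict

cats = ['orange', 'apple', 'banana']
types = [
    {"id": 1, "type": "orange", "bouncyness": 42},
    {"id": 2, "type": "apple", "bouncyness": 21},
    {"id": 3, "type": "apple", "bouncyness": 63},
    {"id": 4, "type": "orange", "bouncyness": 84},
    {"id": 5, "type": "banana", "bouncyness": 99},
]

# outer key: fruit name; inner defaultdict(list) creates missing lists on demand
by_fruit = defaultdict(lambda: defaultdict(list))

for inner in types:
    t = inner["type"]
    if t not in cats:
        continue  # skip types that are not listed in cats
    by_fruit[t][t].append(inner["id"])
    by_fruit[t]["bouncyness"].append(inner["bouncyness"])

# convert back to plain dicts, in order of first appearance in types
rv = [dict(d) for d in by_fruit.values()]
print(rv)
```

    For this data it should print the same three dicts as above; whether it is actually faster depends on the input size, so it is worth measuring before switching.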
    
  3. Here is an alternative approach using the built-in filter() function and a lambda.

    dict_lst = []
    for cat in cats:
        cat_items = filter(lambda x: x['type'] == cat, types)
        cat_dict = {cat: {'UNIT': [x['id'] for x in cat_items]}} 
        dict_lst.append(cat_dict)
    print(dict_lst)
    

    [{'orange': {'UNIT': [1, 4]}},
     {'apple': {'UNIT': [2, 3]}},
     {'banana': {'UNIT': [5]}}]
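
    A small caveat with this approach: in Python 3, filter() returns a lazy iterator, so cat_items can be consumed only once. The loop above uses it exactly once per fruit, so it is fine there, but reusing the same object would give an empty result, as this short sketch shows (the variable names first_pass and second_pass are only illustrative):

```python
types = [
    {"id": 1, "type": "orange"},
    {"id": 2, "type": "apple"},
    {"id": 3, "type": "apple"},
]

cat_items = filter(lambda x: x['type'] == 'apple', types)

first_pass = [x['id'] for x in cat_items]
second_pass = [x['id'] for x in cat_items]  # the iterator is already exhausted

print(first_pass)   # [2, 3]
print(second_pass)  # []
```

    Wrapping the call in list(...) is one way to keep the filtered items around if they are needed more than once.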
    