I have to create two dictionaries and assign keys and values. In the first, the key is the employee ID and the value is the interest. In the second, the key is the interest and the value is the employee ID.
Then I have to print these dictionaries.
I have to open and read the text file first.

So far I’ve got:

file = open("interests.txt", "r")

people = {}

for row in file:
    employee_id = int(row[0])
    people[employee_id] = {
        'interests': row[2:]

        }

from pprint import pprint
pprint (people)

I only get this as a result:

{0: {'interests': 'Cassandra\n'},
 1: {'interests': 'Postgres\n'},
 2: {'interests': 'pandas\n'},
 3: {'interests': 'probability\n'},
 4: {'interests': 'libsvm\n'},
 5: {'interests': 'programming languages\n'},
 6: {'interests': 'theory\n'},
 7: {'interests': 'neural networks\n'},
 8: {'interests': 'artificial intelligence\n'},
 9: {'interests': 'Big Data'}}

But I have to get all the interests that belong to each employee_id, not just the last one.

Please help me.

2 Answers


  1. You’re halfway there. Right now, each time you parse a new row you replace the value stored at that key with an entirely new interest. Instead, make the value at that key a list and append each new interest to it:

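    for row in file:
        # Assumes single-digit IDs followed by one separator character.
        employee_id = int(row[0])
        # Strip the trailing newline so it is not stored with the interest.
        interest = row[2:].rstrip("\n")
    
        if employee_id not in people:
            people[employee_id] = []
    
        people[employee_id].append(interest)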
    

    With this, you will get a dictionary with each ID mapped to the corresponding interests. To have a dictionary where each interest is mapped to the corresponding IDs, you can simply do the same operation in reverse. (Which I will leave to you as a learning exercise. 🙂 )

  2. You’re overwriting the previous value for the same key, because each assignment stores a brand-new dict under it. Instead, use dict.setdefault to initialize each new key with an empty list, so that you can keep appending items to it:

    people = {}
    interests = {}
    for line in file:
        employee_id, interest = line.split(maxsplit=1)
        employee_id = int(employee_id)
        interest = interest.rstrip()
        people.setdefault(employee_id, []).append(interest)
        interests.setdefault(interest, []).append(employee_id)
    

    people becomes:

    {0: ['Hadoop', 'Big Data', 'HBas', 'Java', 'Spark', 'Storm', 'Cassandra'], 1: ['NoSQL', 'MongoDB', 'Cassandra', 'HBase', 'Postgres'], 2: ['Python', 'skikit-learn', 'scipy', 'numpy', 'statsmodels', 'pandas'], 3: ['R', 'Python', 'statistics', 'regression', 'probability'], 4: ['machine learning', 'regression', 'decision trees', 'libsvm'], 5: ['Python', 'R', 'Java', 'C++', 'Haskell', 'programming languages'], 6: ['statistics', 'probability', 'mathematics', 'theory'], 7: ['machine learning', 'scikit-learn', 'Mahout', 'neural networks'], 8: ['neural networks', 'deep learning', 'Big Data', 'artificial intelligence'], 9: ['Hadoop', 'Java', 'MapReduce', 'Big Data']}
    

    interests becomes:

    {'Hadoop': [0, 9], 'Big Data': [0, 8, 9], 'HBas': [0], 'Java': [0, 5, 9], 'Spark': [0], 'Storm': [0], 'Cassandra': [0, 1], 'NoSQL': [1], 'MongoDB': [1], 'HBase': [1], 'Postgres': [1], 'Python': [2, 3, 5], 'skikit-learn': [2], 'scipy': [2], 'numpy': [2], 'statsmodels': [2], 'pandas': [2], 'R': [3, 5], 'statistics': [3, 6], 'regression': [3, 4], 'probability': [3, 6], 'machine learning': [4, 7], 'decision trees': [4], 'libsvm': [4], 'C++': [5], 'Haskell': [5], 'programming languages': [5], 'mathematics': [6], 'theory': [6], 'scikit-learn': [7], 'Mahout': [7], 'neural networks': [7, 8], 'deep learning': [8], 'artificial intelligence': [8], 'MapReduce': [9]}
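
Both answers can be combined into one self-contained sketch. It assumes each line of the file has the form `<employee_id> <interest>`, separated by whitespace, which is a guess based on the output in the question. The helper name `build_indexes` and the sample data are hypothetical, and `collections.defaultdict` is used here as an alternative to `dict.setdefault`:

```python
import io
from collections import defaultdict
from pprint import pprint


def build_indexes(lines):
    """Return (people, interests) built from lines of '<id> <interest>'."""
    people = defaultdict(list)     # employee ID -> list of interests
    interests = defaultdict(list)  # interest -> list of employee IDs
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split only on the first run of whitespace, so multi-word
        # interests such as "Big Data" stay intact and IDs may have
        # more than one digit.
        raw_id, interest = line.split(maxsplit=1)
        employee_id = int(raw_id)
        people[employee_id].append(interest)
        interests[interest].append(employee_id)
    # Convert back to plain dicts so missing keys raise KeyError again.
    return dict(people), dict(interests)


# Hypothetical sample standing in for interests.txt (assumed format).
sample = io.StringIO("0 Hadoop\n0 Big Data\n1 NoSQL\n9 Hadoop\n10 Big Data\n")
people, interests = build_indexes(sample)
pprint(people)
pprint(interests)
```

With a real file, the same function should work on the file object directly, for example `with open("interests.txt") as f: people, interests = build_indexes(f)`. Using `with` also closes the file when the block ends.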
    