
Consider the following example:
"10% off on all Artificial Intelligence courses."
In this example, I need to extract two predefined classes: Artificial Intelligence and courses. The program should also classify terms such as ANN, CNN, RNN, and AI into the Artificial Intelligence category. I trained a spaCy model, but I am not impressed with the results because it does not label the entities correctly. Is there an alternative way to extract entities from a sentence in Python?

2 Answers


  1. Here are a few options I would try:

    1. Custom entity extraction with Rasa:
    https://rasa.com/docs/rasa/nlu/entity-extraction/#custom-entities

    2. BERT-based NER for custom entities. Take a look at the following repositories:
    https://github.com/allenai/scibert
    https://github.com/dmis-lab/biobert
    
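  2. You can use flashtext for this. It performs dictionary-based keyword lookup, so every term you list is mapped to the category you assign it.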

    from flashtext import KeywordProcessor
    
    kp = KeywordProcessor()
    
    # Map each category (key) to the terms that should resolve to it (values).
    # Whenever one of the terms appears in the text, the category name is returned.
    keyword_dict = {
        'Artificial Intelligence': ['ANN', 'CNN', 'RNN', 'AI', 'Artificial Intelligence'],
        'courses': ['courses'],
    }
    
    kp.add_keywords_from_dict(keyword_dict)
    
    # "Artificial Intelligence", "ANN" and "CNN" all belong to the 'Artificial Intelligence' key,
    # so each of them is extracted as 'Artificial Intelligence'.
    kp.extract_keywords('10% off on all Artificial Intelligence, ANN, and CNN courses.')
    # Output:
    # ['Artificial Intelligence',
    #  'Artificial Intelligence',
    #  'Artificial Intelligence',
    #  'courses']
    

    For more information, see the flashtext documentation: https://readthedocs.org/projects/flashtext/downloads/pdf/latest/
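
If you would rather not add a dependency, the same dictionary-lookup idea can be sketched with the standard library alone. This is only a rough illustration, not how flashtext works internally: the names `CATEGORY_TERMS`, `build_matcher` and `extract_categories` are made up for this example, and the word-boundary regex is an assumption that suits plain alphanumeric terms like the ones in the question.

```python
import re

# Hypothetical mapping: category -> surface forms that should resolve to it.
CATEGORY_TERMS = {
    "Artificial Intelligence": ["ANN", "CNN", "RNN", "AI", "Artificial Intelligence"],
    "courses": ["courses"],
}


def build_matcher(category_terms):
    """Compile one case-insensitive regex plus a lowercase term -> category lookup."""
    lookup = {
        term.lower(): category
        for category, terms in category_terms.items()
        for term in terms
    }
    # Try longer terms first so multi-word phrases win over shorter overlaps.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern, lookup


def extract_categories(text, pattern, lookup):
    """Return (category, start, end) for each matched term, in text order."""
    return [
        (lookup[match.group(0).lower()], match.start(), match.end())
        for match in pattern.finditer(text)
    ]


pattern, lookup = build_matcher(CATEGORY_TERMS)
text = "10% off on all Artificial Intelligence, ANN, and CNN courses."
matches = extract_categories(text, pattern, lookup)
categories = [category for category, _, _ in matches]
```

The character offsets in `matches` let you recover which term triggered each category, for example `text[start:end]`. Because the match is case-insensitive, a short term such as "AI" or "ANN" will also match the same letters in other casing when they stand alone as a word, so you may want to drop `re.IGNORECASE` for acronyms.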
