
I’m a beginner in the field of artificial intelligence. I can use GATE or any other natural language processing tool, but I don’t have an answer for this:

Do you know how to evaluate how close two sentences are, even with a large data set?

Do you have any recommendations? I could use the number of permutations, the length, the number of tokens, their Metaphone encodings, etc., but I don’t know which measure I should use.

My goal is:
- "Hello Jarvis"
- "Hello Romain, how are you"

- "Hello arvis"
- "Hello Romain, how are you"

- "Hello mister Swift"
- I don't know what you are expecting, is this like "Hello Jarvis"?
- Yes
- Ok, Hello Romain, How are you?

- "Hello mister swift, how are you?"
- I don't know what you are expecting.
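
The dialogue above suggests three outcomes: answer directly, ask for confirmation, or give up. A minimal sketch of that idea, assuming a single known phrase and two cut-off values that I picked arbitrarily (`ACCEPT` and `ASK` are hypothetical and would need tuning on real data), using the standard-library `difflib.SequenceMatcher`:

```python
from difflib import SequenceMatcher

# Hypothetical knowledge base: known phrase -> reply.
KNOWN_REPLIES = {"Hello Jarvis": "Hello Romain, how are you"}

# Arbitrary thresholds chosen for this sketch only.
ACCEPT = 0.85  # at or above: treat as the known phrase
ASK = 0.45     # at or above (but below ACCEPT): ask for confirmation


def respond(utterance):
    """Pick a reply based on the closest known phrase."""
    query = utterance.lower().strip()
    best_phrase, best_score = None, 0.0
    for phrase in KNOWN_REPLIES:
        # ratio() is a similarity in [0, 1], 1.0 meaning identical.
        score = SequenceMatcher(None, query, phrase.lower()).ratio()
        if score > best_score:
            best_phrase, best_score = phrase, score
    if best_score >= ACCEPT:
        return KNOWN_REPLIES[best_phrase]
    if best_score >= ASK:
        return f'I don\'t know what you are expecting, is this like "{best_phrase}"?'
    return "I don't know what you are expecting."


if __name__ == "__main__":
    for text in ["Hello Jarvis", "Hello arvis", "Hello mister Swift",
                 "Hello mister swift, how are you?"]:
        print(text, "->", respond(text))
```

With a large set of known phrases, the linear scan would probably need to be replaced by some kind of index, but the thresholding logic would stay the same.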


The values 1, 2, 3 or n are just an example of a similarity scale.


- "Hello IA" is close to
   - "Hello IA" by 0
   - "Hello AI" by 1

- "Hello Jarvis" is close to
   - "Hello AI" by 2
   - "Hello IA" by 2

- "Hello! mister Swift" is close to
   - "Hello AI" by 3
   - "Hello IA" by 3
   - "Hello Jarvis" by 2
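
One possible way to get a single-number scale like the one above is an edit distance. This is only a sketch using the classic Levenshtein distance (my choice, not something fixed by the question), and it will not reproduce the illustrative numbers exactly. The function `levenshtein` below is written from scratch and accepts either strings (character level) or token lists (word level):

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions
    needed to turn sequence a into sequence b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Character level: swapping two letters counts as two edits.
    print(levenshtein("Hello IA", "Hello AI"))
    # Token level: one word replaced.
    print(levenshtein("Hello IA".split(), "Hello AI".split()))
```

Working on tokens rather than characters makes "Hello IA" and "Hello AI" differ by 1, which is closer to the scale sketched above.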

Less Basic

- "Hello IA" is (token length, token word, grammatically, syntactically) close to
   - "Hello IA" by (0,0,0,0)
   - "Hello AI" by (0,1,0,0)

- "Hello Jarvis" is close to
   - "Hello AI" by (0,2,1,1)
   - "Hello IA" by (0,2,1,1)

- "Hello! mister Swift" is close to
   - "Hello AI" by (1,2,2,2)
   - "Hello IA" by (1,2,2,2)
   - "Hello Jarvis" by (1,2,2,2)
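
The first two components of such a tuple can be computed with the standard library alone. The sketch below is one possible reading of "token length" and "token word" (both definitions are my assumptions); the grammatical and syntactic components are left out because they would need a tagger or parser. `tokenize` and `distance_vector` are hypothetical helper names:

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase the sentence and keep only word-like tokens."""
    return re.findall(r"\w+", sentence.lower())


def distance_vector(a, b):
    """Return (token length difference, token word difference).

    Assumed definitions:
    - length difference: absolute difference in token counts
    - word difference: the larger number of unmatched tokens on either side
    """
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    length_diff = abs(len(tokens_a) - len(tokens_b))
    count_a, count_b = Counter(tokens_a), Counter(tokens_b)
    only_a = sum((count_a - count_b).values())  # tokens of a missing from b
    only_b = sum((count_b - count_a).values())  # tokens of b missing from a
    return (length_diff, max(only_a, only_b))


if __name__ == "__main__":
    print(distance_vector("Hello IA", "Hello AI"))
    print(distance_vector("Hello! mister Swift", "Hello Jarvis"))
```

Keeping the components separate, rather than summing them, leaves room to weight them differently later.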



  1. If you are ready to learn hard-core NLP, you may use a classifier for this task. Have a look, for instance, at Stanford NLP (Java) or NLTK (Python).

    If you want to keep things simple and use an out-of-the-box solution, have a look at the API; it does exactly what you need, and more.

  2. One way to determine string similarity is to use string kernels. There’s a good paper by Lodhi et al. explaining how this works.

    In order to create a classifier using CoreNLP you would have to create features for the string, such as n-grams, lemmas or similar.
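
To make the n-gram idea concrete, here is a rough sketch in plain Python (not CoreNLP's API). It builds character n-gram counts as features and compares two strings with a p-spectrum kernel, which is a simpler relative of the subsequence kernel and not necessarily the method from the paper. The function names and the default `n=3` are my own choices:

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count the character n-grams of a lowercased string."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def spectrum_kernel(a, b, n=3):
    """Inner product of the two n-gram count vectors."""
    features_a, features_b = char_ngrams(a, n), char_ngrams(b, n)
    # A Counter returns 0 for n-grams it has not seen.
    return sum(count * features_b[gram] for gram, count in features_a.items())


def normalized_kernel(a, b, n=3):
    """Kernel value scaled to [0, 1]; 1.0 means identical n-gram profiles."""
    denom = sqrt(spectrum_kernel(a, a, n) * spectrum_kernel(b, b, n))
    return spectrum_kernel(a, b, n) / denom if denom else 0.0


if __name__ == "__main__":
    for other in ["Hello Jarvis", "Hello arvis", "Hello mister Swift"]:
        print(other, normalized_kernel("Hello Jarvis", other))
```

The same `char_ngrams` counts could also be fed to a classifier as features, which is roughly what the feature-based approach amounts to.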
