
I should start by saying that I’m not good at programming; nonetheless, it is extremely fun!
I’m working on a Siri-like program and I’m trying to implement a Wikipedia function. To do this I ask a question, for example: tell me about superman

I need to extract the word superman, or whatever other word someone might ask about, from the string. This is not that hard, but the real problems start when someone asks: could you tell me about superman. I still want to extract the word superman.

This is an example of what I have tried before:

if ((c.Contains("tell me about")) || (c.Contains("Tell me about")))
{
    string query = c;
    var part = query.Split('t').Last(); // problem: splitting on 't' breaks subjects that contain the letter t, like "artificial intelligence"

    string url = ("http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=" + part + "&MaxHits=1");

    XmlReader reader = XmlReader.Create(url);
    while (reader.Read())
        switch (reader.Name.ToString())
        {
            case "Description":
                sp(reader.ReadString());
                break;

        }
}

I was almost able to solve the problem; the solution below seems to work about 80% of the time. It is at least a step in the right direction.

if ((c.Contains("tell me about")) || (c.Contains("Tell me about")))
{
    string query = c;
    // Splitting also yields the text before "about ", so that part is looked up as well.
    string[] lines = Regex.Split(query, "about ");
    foreach (string line in lines)
    {
        string url = ("http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=" + line + "&MaxHits=1");

        XmlReader reader = XmlReader.Create(url);
        while (reader.Read())
            switch (reader.Name.ToString())
            {
                case "Description":
                    sp(reader.ReadString());
                    break;

            }
    }
}

Is there a better/easier way to do this?

2 Answers


  1. Chosen as BEST ANSWER

    I finally found an answer:

        if ((c.Contains("tell me about")) || (c.Contains("Tell me about")))
        {
            string query = c;
            string[] lines = Regex.Split(query, "about ");
            // Keep only the text after the last "about ".
            string finalquery = lines[lines.Length - 1];

            string url = ("http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=" + finalquery + "&MaxHits=1");

            XmlReader reader = XmlReader.Create(url);
            while (reader.Read())
                switch (reader.Name.ToString())
                {
                    case "Description":
                        sp(reader.ReadString());
                        break;
                }
        }
    

    It now works 100% of the time! If someone knows a better way to do this, I would be more than happy to hear it.
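
One possible refinement of the approach above, offered as a sketch rather than a definitive fix: a single case-insensitive regular expression can pick out the text after the last "about", and `Uri.EscapeDataString` can encode multi-word subjects such as "artificial intelligence" before they go into the URL. The class `SubjectExtractor` and the methods `ExtractSubject` and `BuildUrl` are hypothetical names introduced only for this illustration. Note that, unlike the `Contains("tell me about")` check, this pattern accepts any sentence containing the word "about".

```csharp
using System;
using System.Text.RegularExpressions;

static class SubjectExtractor
{
    // Hypothetical helper, not part of the original code.
    // Returns the text after the last "about", or null when the word is absent.
    public static string ExtractSubject(string input)
    {
        // The greedy ".*" pushes the match to the last "about" in the sentence;
        // trailing whitespace and sentence punctuation are left out of the group.
        Match m = Regex.Match(input, @"^.*\babout\s+(.+?)\s*[?.!]*$", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    // Hypothetical helper: builds the lookup URL used in the question,
    // escaping spaces and reserved characters in the subject.
    public static string BuildUrl(string subject)
    {
        return "http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString="
            + Uri.EscapeDataString(subject) + "&MaxHits=1";
    }
}
```

With this sketch, `SubjectExtractor.ExtractSubject("Could you tell me about superman?")` yields `superman`, and the resulting URL can be passed to `XmlReader.Create` exactly as in the code above.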


  2. As suggested in the comments, if it’s for any kind of production application, the best option is to use an existing library.

    Still it can be a fun exercise to do it on your own.

    I would say there are many more ways to ask about Superman:

    "what do you know about Superman"
    "let's talk about Superman"
    "who is Superman"
    

    And many many more.

    All the questions are built from some auxiliary words (“what”, “who”, “a”, “about”) plus the actual word describing the subject of the question: “Superman”.
    The simplified approach would be to eliminate all the auxiliaries and take whatever remains.

    To quickly build a simple list of question words and question phrases, I used an English grammar site. I took the sample phrases and removed the subject of each question. This gave me 50-60 auxiliary words for my list.

    Now all I do is take the sentence and remove every word that is in the auxiliary list. The code is below:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        // All the words collected from the sample question phrases.
        private static string auxStr = @"Who is the Who are Who is that there Where is the Where do you Where are my 
            When do the When is his When are we Why do we Why are they always Why does he What is What is her What is the Which 
            drink did you Which Which is How do you How does he know the answer How can I learn many much often far tell say 
            explain answer for from with about on me he his him her hers your yours they their theirs";

        private static List<string> aux = new List<string>();

        static void Main(string[] args)
        {
            // Build a list of auxiliary words. Splitting on any whitespace keeps
            // the line breaks and indentation of auxStr out of the list.
            aux = auxStr.ToLower()
                .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
                .Distinct().ToList();

            // Test the method to get a subject.
            var subject = GetSubject("Do you know where is Poland", aux);

            foreach (var s in subject)
            {
                Console.WriteLine(s);
            }

            Console.ReadLine();
        }

        private static List<string> GetSubject(string question, List<string> auxiliaries)
        {
            // Convert the question to a list of distinct lowercase words.
            var listQuestion = question.ToLower().Split(' ').Distinct().ToList();

            // Remove from the question all the words
            // that are in the list of auxiliary words.
            var notAux = listQuestion.Where(w => !auxiliaries.Contains(w)).ToList();

            return notAux;
        }
    }
    

    It’s quite simplistic, but with very little effort it narrows down the list of potential subjects of the question.
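
A possible variation on the filtering idea, sketched under a few assumptions: keep the original word order and casing, strip sentence punctuation, and join what remains into a single phrase so that multi-word subjects survive. The class `SubjectFilter`, the method `GetSubjectPhrase`, and the short word set below are hypothetical and only illustrative; a real auxiliary list would be much longer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SubjectFilter
{
    // Small illustrative set of auxiliary words (an assumption, not the full list).
    // The comparer makes lookups case-insensitive, so no ToLower() call is needed.
    private static readonly HashSet<string> Aux = new HashSet<string>(
        new[] { "could", "you", "tell", "me", "about", "what", "do", "know", "who", "is", "let's", "talk" },
        StringComparer.OrdinalIgnoreCase);

    // Hypothetical helper: returns the non-auxiliary words joined into one phrase.
    public static string GetSubjectPhrase(string question)
    {
        var words = question
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            // Drop punctuation attached to a word, e.g. the "?" in "Superman?".
            .Select(w => w.Trim('?', '.', '!', ',', '"'))
            .Where(w => w.Length > 0 && !Aux.Contains(w));

        return string.Join(" ", words);
    }
}
```

For example, `SubjectFilter.GetSubjectPhrase("Could you tell me about artificial intelligence?")` gives `artificial intelligence`, which could then be URL-escaped and used as the lookup query.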
