
I want to scan a chat conversation quickly, in close to real time, and detect any food names in it. I have a database of about 400K foods, so the trivial solution of just using a regex in memory won't scale.

I'm using Postgres for the database and Ruby on Rails for the application.

Any ideas?

2 Answers


  1. Join the 400K food names into a single string, with the names separated by whitespace. Then build a regex from the words in the chat message and match it against that string, using word anchors so that only whole names match.

  2. Consider PostgreSQL's Full-Text Search for locating food names in a chat conversation. Preprocess the text first, cache lookups, and define confidence criteria for what counts as a match. If real time isn't essential, process messages in batches, and use parallel processing to scale to a database of 400K foods. Optionally, combine machine learning and natural language processing to increase accuracy.
    Hope it works 🙂
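
The first answer's idea can be sketched in plain Ruby as follows. This is a minimal sketch under some assumptions, not a tested production approach: the helper names `candidate_phrases` and `foods_in` are hypothetical, the food names are stored one per line and already lowercased, and line anchors (`^` and `$`) are used instead of word anchors so that multi-word names match as a whole and "apple" does not match inside "pineapple". The tiny `FOOD_LIST` stands in for the real 400K names.

```ruby
# Hypothetical helper: every run of 1..max_words consecutive words
# in the message, lowercased, e.g. "apple", "apple pie", "pie and rice".
def candidate_phrases(message, max_words: 3)
  words = message.downcase.scan(/[[:alnum:]]+/)
  (1..max_words).flat_map do |n|
    words.each_cons(n).map { |gram| gram.join(" ") }
  end.uniq
end

# Hypothetical helper: build one regex from the chat message and match it
# against the food list string (one lowercased name per line).
def foods_in(message, food_list, max_words: 3)
  phrases = candidate_phrases(message, max_words: max_words)
  return [] if phrases.empty?

  alternation = phrases.map { |phrase| Regexp.escape(phrase) }.join("|")
  # In Ruby, ^ and $ match at line boundaries, so only whole lines
  # (complete food names) are returned.
  food_list.scan(/^(?:#{alternation})$/)
end

# Stand-in for the real list, which would be loaded once from the database.
FOOD_LIST = ["apple", "apple pie", "pineapple", "rice"].join("\n")

found = foods_in("I had Apple pie and rice today", FOOD_LIST)
puts found.inspect
```

The regex is rebuilt for each message, but it stays small because it depends only on the message length. The food list string is built once and reused. The `max_words` limit is an assumption about how long a food name can be and would need tuning against the real data.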
