Here is a simple example that uses Redis with embeddings, but it's not clear to me how I can build and load my own embeddings, then pull them from Redis later and use them in a search:

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.redis import Redis

embeddings = OpenAIEmbeddings()
metadata = [
    {
        "user": "john",
        "age": 18,
        "job": "engineer",
        "credit_score": "high"
    }
]
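texts = ["foo", "foo", "foo", "bar", "bar"]
# from_texts expects one metadata dict per text, so repeat the single entry
metadata = metadata * len(texts)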

rds = Redis.from_texts(
    texts,
    embeddings,
    metadata,
    redis_url="redis://localhost:6379",
    index_name="users",
)

results = rds.similarity_search("foo")
print(results[0].page_content)

But I want to load text from, e.g., a text file, create embeddings, and load them into Redis for later use. Something like this:

from openai import OpenAI
client = OpenAI()

def get_embedding(text, model="text-embedding-ada-002"):
    text = text.replace("\n", " ")
    return client.embeddings.create(input = [text], model=model).data[0].embedding

Does anyone have a good example of how to implement this approach? I'm also wondering about TTL for embeddings in Redis.

2 Answers


  1. It seems the package requires the embeddings object to conform to langchain_core.embeddings.Embeddings. Looking at the source code, you can create a custom embedder by implementing the two required methods, embed_query and embed_documents:

    from langchain_core.embeddings import Embeddings
    from openai import OpenAI
    
    class CustomEmbeddings(Embeddings):
        # pass any args/setup here
        def __init__(self, model):
            self.model = model
            self.client = OpenAI()
    
        def embed_documents(self, texts):
            # embed each text with the same code path as a query
            return [self.embed_query(text) for text in texts]
    
        def embed_query(self, text):
            return self._get_embedding(text)
    
        # your custom get_embedding function
        def _get_embedding(self, text):
            # newlines are replaced with spaces before embedding
            text = text.replace("\n", " ")
            return self.client.embeddings.create(input=[text], model=self.model).data[0].embedding
    
    embeddings = CustomEmbeddings("text-embedding-ada-002")
    
    # rest of your code as previous
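
For the text-file part of the question, here is a minimal sketch of reading a file, cutting it into chunks, and running the chunks through any embedder that exposes `embed_documents` (such as the custom class above). The helper names `chunk_text` and `embed_file` and the fixed-size character split are assumptions for illustration, not part of LangChain.

```python
from pathlib import Path


def chunk_text(text, chunk_size=1000, overlap=0):
    """Split text into windows of at most chunk_size characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), step)]
    # Drop chunks that contain only whitespace
    return [c for c in chunks if c.strip()]


def embed_file(path, embedder, chunk_size=1000, overlap=0):
    """Read a UTF-8 text file and return (chunks, vectors).

    `embedder` is any object with an embed_documents(list_of_str) method.
    """
    text = Path(path).read_text(encoding="utf-8")
    chunks = chunk_text(text, chunk_size, overlap)
    vectors = embedder.embed_documents(chunks)
    return chunks, vectors
```

In practice you may not need the vectors yourself: `Redis.from_texts(chunks, embeddings, redis_url=..., index_name=...)` from the question calls `embed_documents` on the embedder it is given, so passing the chunks and the embedder object should be enough.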
    
  2. Hello! You can use TextLoader to load the text file and a text splitter to split it into documents.

    For example:

    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores.redis import Redis
    from langchain.document_loaders import TextLoader
    from langchain.text_splitter import CharacterTextSplitter
    
    
    embeddings = OpenAIEmbeddings()
    
    loader = TextLoader("union.txt", encoding="utf-8")
    
    documents = loader.load()
    
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    docs = text_splitter.split_documents(documents)
    
    vectorstore = Redis.from_documents(
        docs,
        embeddings,
        redis_url="redis://localhost:6379",
        index_name="users",
    )
    
    
    results = vectorstore.similarity_search_with_score("He met the Ukrainian people.")
    print(results)
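
Neither answer covers the TTL part of the question. Redis expires data per key with the EXPIRE command, so one possible approach is to collect the keys of the stored documents and set a timeout on each. The sketch below assumes a redis-py style client with an `expire(name, time)` method; the helper name `apply_ttl` is hypothetical, and the LangChain call shown in the comments should be checked against the version you have installed.

```python
def apply_ttl(client, keys, ttl_seconds):
    """Set an expiry on each key and return the keys that were updated.

    `client` is any object with a redis-py style expire(name, time) method.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    updated = []
    for key in keys:
        # EXPIRE reports success only when the key exists and the timeout was set
        if client.expire(key, ttl_seconds):
            updated.append(key)
    return updated


# Usage sketch (assumes a running Redis server and the redis package):
#   import redis
#   client = redis.Redis.from_url("redis://localhost:6379")
#   rds, keys = Redis.from_texts_return_keys(
#       texts, embeddings, redis_url="redis://localhost:6379", index_name="users"
#   )
#   apply_ttl(client, keys, ttl_seconds=3600)
```

To query the same index in a later session, the LangChain Redis store also has `Redis.from_existing_index`, which, as far as I know, needs the index name and the schema that was used when the index was created.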
    