llm = AzureOpenAI(openai_api_key=OPENAI_API_KEY, deployment_name=OPENAI_DEPLOYMENT_NAME, model_name=MODEL_NAME)



# Configure the location of the PDF file.
pdfReader = PdfReader('databorders.pdf')


# Extract the text from the PDF file.
raw_text = ''
for i, page in enumerate(pdfReader.pages):
    text = page.extract_text()
    if text:
        raw_text += text

# Show first 1000 characters of the text.
raw_text[:1000]


# Split the text into chunks of 1000 characters with 200 characters overlap.
text_splitter = CharacterTextSplitter(        
    separator = "\n",
    chunk_size = 1000,
    chunk_overlap  = 200,
    length_function = len,
)
pdfTexts = text_splitter.split_text(raw_text)


# Show how many chunks of text are generated.
len(pdfTexts)

# Pass the text chunks to the Embedding Model from Azure OpenAI API to generate embeddings.
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, deployment=OPENAI_EMBEDDING_MODEL_NAME, client="azure", chunk_size=1)

# Use FAISS to index the embeddings. This will allow us to perform a similarity search on the texts using the embeddings.
# https://python.langchain.com/en/latest/modules/indexes/vectorstores/examples/faiss.html
pdfDocSearch = FAISS.from_texts(pdfTexts, embeddings)

# Create a Question Answering chain using the embeddings and the similarity search.
# https://docs.langchain.com/docs/components/chains/index_related_chains
chain = load_qa_chain(llm, chain_type="stuff")


# Perform first sample of question answering.
inquiry = "Who is the author of this book?"
docs = pdfDocSearch.similarity_search(inquiry)
chain.run(input_documents=docs, question=inquiry)

It gives this error:
openai.error.InvalidRequestError: The completion operation does not work with the specified model, gpt-4. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993.

2 Answers


  1. It gives this error: openai.error.InvalidRequestError: The completion
    operation does not work with the specified model, gpt-4. Please choose
    a different model and try again. You can learn more about which models
    can be used with each operation here.

    This error occurs when the deployment you pass in the configuration uses a model that does not support the operation being called.

    According to Document-1 and Document-2, you need a
    text-davinci-003 model for completion and a text-embedding-ada-002 model for embedding.

    When I tried with these models, the code ran and gave me output.

    Code:

    from langchain.llms import AzureOpenAI
    from PyPDF2 import PdfReader
    from langchain.text_splitter import CharacterTextSplitter
    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.vectorstores.faiss import FAISS
    from langchain.chains.question_answering import load_qa_chain
    
    OPENAI_API_KEY = "xxxxx"
    OPENAI_DEPLOYMENT_NAME = "testxxxa"  # deployment name with text-embedding-ada-002 model
    deployment = "textxxx"               # deployment name with text-davinci-003 model
    openai_api_base1 = "xxxxxx"
    
    # The LLM must point at a deployment whose model supports the completion operation.
    llm = AzureOpenAI(
        openai_api_key=OPENAI_API_KEY,
        deployment_name=deployment,
        openai_api_base=openai_api_base1,
        openai_api_version="2022-12-01",
        openai_api_type="azure",
    )
    
    pdfReader = PdfReader('example.pdf')
    
    raw_text = ''
    for i, page in enumerate(pdfReader.pages):
        text = page.extract_text()
        if text:
            raw_text += text
    
    raw_text[:1000]
    
    text_splitter = CharacterTextSplitter(        
        separator = "\n",
        chunk_size = 1000,
        chunk_overlap  = 200,
        length_function = len,
    )
    pdfTexts = text_splitter.split_text(raw_text)
    
    len(pdfTexts)
    
    embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, deployment=OPENAI_DEPLOYMENT_NAME, openai_api_base=openai_api_base1, openai_api_type="azure", openai_api_version="2022-12-01",chunk_size=1)
    
    pdfDocSearch = FAISS.from_texts(pdfTexts, embeddings)
    chain = load_qa_chain(llm, chain_type="stuff")
    inquiry = "Which month is specified?"
    docs = pdfDocSearch.similarity_search(inquiry)
    print(chain.run(input_documents=docs, question=inquiry))
    

    Output:

     September
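
The splitting step in the code above can be pictured with a small plain-Python sketch. It is not LangChain's implementation of CharacterTextSplitter, only a simplified illustration of cutting on a separator and merging the pieces into overlapping chunks. The function name split_with_overlap and the sample text are made up for the example. Note that the separator here is the newline escape "\n"; a bare "n" would cut the text at every letter n instead.

```python
def split_with_overlap(text, separator="\n", chunk_size=1000, chunk_overlap=200):
    """Simplified illustration (an assumption, not LangChain's code): merge
    separator-delimited pieces into chunks of at most chunk_size characters,
    repeating trailing pieces at the start of the next chunk as overlap."""
    pieces = [p for p in text.split(separator) if p]
    chunks, current = [], []

    def joined_len(parts):
        return len(separator.join(parts))

    for piece in pieces:
        if current and joined_len(current + [piece]) > chunk_size:
            chunks.append(separator.join(current))
            # Drop leading pieces until the carried-over tail fits within
            # chunk_overlap and still leaves room for the next piece.
            while current and (
                joined_len(current) > chunk_overlap
                or joined_len(current + [piece]) > chunk_size
            ):
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


# Hypothetical sample text: ten short lines.
sample = "\n".join(f"line {i:02d}" for i in range(10))
for chunk in split_with_overlap(sample, chunk_size=25, chunk_overlap=8):
    print(repr(chunk))
```

Each chunk after the first starts with the last line of the previous one, which is the role chunk_overlap plays in the answer's code.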
    

  2. In OpenAI, you have two main operations for text generation:

    • completion
    • chatCompletion

    Some models can be used for completion (e.g. text-davinci-003, GPT-3.5 version 0301), others for chatCompletion (e.g. GPT-3.5 version 0613, GPT-4). GPT-4 is chat-only, which is what the error message reports.

    Something that is not visible in your code is that langchain calls OpenAI with a completion operation when the chain built by load_qa_chain runs.

    Doc: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability

    So in your case, when you set your llm, you should pass a deployment whose model supports the completion operation:

    llm = AzureOpenAI(openai_api_key=OPENAI_API_KEY, deployment_name=OPENAI_DEPLOYMENT_NAME, model_name=MODEL_NAME)
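
As a rough illustration of the point above, the sketch below maps a model name to the operation it is used with in this thread and to the LangChain wrapper that sends that kind of request. The lookup tables and the helper wrapper_for are hypothetical and cover only the models named here; they are not an official or complete list. The import paths are given as strings so the sketch runs without LangChain installed.

```python
# Illustrative lookup built only from what this thread reports; it is not an
# official or complete list of Azure OpenAI models.
OPERATION_BY_MODEL = {
    "text-davinci-003": "completion",
    "gpt-4": "chatCompletion",
    "text-embedding-ada-002": "embedding",
}

# LangChain wrapper (as an import path) that issues each kind of request.
WRAPPER_BY_OPERATION = {
    "completion": "langchain.llms.AzureOpenAI",
    "chatCompletion": "langchain.chat_models.AzureChatOpenAI",
    "embedding": "langchain.embeddings.openai.OpenAIEmbeddings",
}


def wrapper_for(model_name: str) -> str:
    """Return the import path of the wrapper matching the model's operation."""
    try:
        operation = OPERATION_BY_MODEL[model_name]
    except KeyError:
        raise ValueError(f"unknown model: {model_name!r}") from None
    return WRAPPER_BY_OPERATION[operation]


for name in ("text-davinci-003", "gpt-4"):
    print(name, "->", wrapper_for(name))
```

If you want to keep a gpt-4 deployment, one option is to build the llm with AzureChatOpenAI instead of AzureOpenAI; the rest of the code should be able to stay the same, since load_qa_chain also accepts chat models.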
    