
I imported some text files into Azure OpenAI. After the import, I see a "title" field used for search, which I can't edit via the UI because it is greyed out.

How can I define the title for each document? For example, does the Azure OpenAI On Your Data API allow me to define the title for each document?


By default, titles are prepopulated via automated summarization (which seems to be simply truncation?). I can see some titles e.g. via:

import os

from openai import AzureOpenAI
#from azure.identity import DefaultAzureCredential, get_bearer_token_provider

endpoint = os.getenv("ENDPOINT_URL", "https://[redacted].openai.azure.com/")
deployment = os.getenv("DEPLOYMENT_NAME", "[redacted GPT engine name]")
search_endpoint = os.getenv("SEARCH_ENDPOINT", "https://[redacted].search.windows.net")
search_key = os.getenv("SEARCH_KEY", "[redacted key]")
search_index = os.getenv("SEARCH_INDEX_NAME", "[redacted]")

# token_provider = get_bearer_token_provider(
#     DefaultAzureCredential(),
#     "https://cognitiveservices.azure.com/.default")

client = AzureOpenAI(
    azure_endpoint=endpoint,
    api_version="2024-05-01-preview",
    api_key='[redacted key]'
)
# azure_ad_token_provider=token_provider,

completion = client.chat.completions.create(
    model=deployment,
    messages=[
        {
            "role": "user",
            "content": "How can I sort a Python list?"
        }],
    max_tokens=800,
    temperature=0,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None,
    stream=False,
    extra_body={
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": search_endpoint,
                "index_name": search_index,
                "semantic_configuration": "default",
                "query_type": "vector_semantic_hybrid",
                "fields_mapping": {},
                "in_scope": True,
                "role_information": "You are an AI assistant that helps people find information.",
                "filter": None,
                "strictness": 5,
                "top_n_documents": 10,
                "authentication": {
                    "type": "api_key",
                    "key": search_key
                },
                "embedding_dependency": {
                    "type": "deployment_name",
                    "deployment_name": "[redacted]"
                }
            }
        }]
    }
)
print(completion.to_json())

outputs:

{
  "id": "7eb67d03-3868-46fe-8cb1-fdf821c633be",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "To acquire [...].",
        "role": "assistant",
        "end_turn": true,
        "context": {
          "citations": [
            {
              "content": "You can copy to your computer  [...]",
              "title": "You can copy [...]",
              "url": "https://[redacted].blob.core.windows.net/fileupload-he/920.txt",
              "filepath": "000920.txt",
              "chunk_id": "0"
            },
            {
              "content": "Do\r\none of the following:\r\nChoose File > Automate [...]",
              "title": "Do",
              "url": "https://storingspace.blob.core.windows.net/fileupload-b/002715.txt",
              "filepath": "002715.txt",
              "chunk_id": "0"
            },
            [...]
          ],
          "intent": "[\"How to import x\", \"Importing x\", \"Steps to import x\"]"
        }
      }
    }
  ],
  "created": 1720747501,
  "model": "gpt-4o",
  "object": "extensions.chat.completion",
  "system_fingerprint": "fp_abc28019ad",
  "usage": {
    "completion_tokens": 230,
    "prompt_tokens": 5480,
    "total_tokens": 5710
  }
}
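
To list the titles the index currently returns, the citations can be pulled out of the response. This is a minimal sketch, assuming the response has the shape shown above (`choices[0].message.context.citations`); the helper name `citation_titles` is hypothetical and not part of any SDK:

```python
import json


def citation_titles(response_json: str) -> list[tuple[str, str]]:
    """Return (filepath, title) pairs for every citation in the first choice."""
    response = json.loads(response_json)
    message = response["choices"][0]["message"]
    # "context" is assumed to be absent when no data source was used,
    # so fall back to an empty citation list instead of raising.
    citations = message.get("context", {}).get("citations", [])
    return [(c.get("filepath", ""), c.get("title", "")) for c in citations]
```

With the code above it could be called as `citation_titles(completion.to_json())`, which makes it easy to spot truncated titles such as "Do".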

2 Answers


  1. The Azure OpenAI On Your Data API cannot modify the Azure AI Search index. It only returns results for the search query, with citations based on the data in the index.

    To change fields or field values, use the Azure AI Search REST API or SDK.

    The Azure AI Search documentation for Add, Update or Delete Documents shows how to do this with the REST API.

    In your case the request body looks like the following, where key_field_name stands for the key field in your index schema and its value is the unique key of the document:

    {
      "value": [
        {
          "@search.action": "merge",
          "key_field_name": "unique_key_of_document",
          "title": "your_custom_title"
        },
        ...
      ]
    }
    

    Build a request body containing every document key with its new title, then send it to the REST API to update the documents.
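
The request described in this answer could be assembled in Python roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: the helper names are hypothetical, the key field name must be taken from your own index schema, and the `api_version` default of `2023-11-01` is an assumption you may need to change for your service.

```python
import json
import urllib.request


def build_title_merge_body(key_field: str, titles_by_key: dict[str, str]) -> dict:
    """Build a batch body that merges a new title into each existing document."""
    return {
        "value": [
            {"@search.action": "merge", key_field: key, "title": title}
            for key, title in titles_by_key.items()
        ]
    }


def build_index_request(search_endpoint: str, index_name: str, admin_key: str,
                        body: dict, api_version: str = "2023-11-01"):
    """Prepare (but do not send) the POST to the index's docs/index endpoint."""
    url = (f"{search_endpoint.rstrip('/')}/indexes/{index_name}"
           f"/docs/index?api-version={api_version}")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        # An admin key is assumed here, since the request writes to the index.
        headers={"Content-Type": "application/json", "api-key": admin_key},
        method="POST",
    )
```

The prepared request can then be sent with `urllib.request.urlopen(request)`. Keeping the construction separate from the network call makes it possible to inspect the body before anything is written to the index.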

  2. Using Python:

    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    # service_endpoint, index_name, api_key and your_document_id are placeholders
    # for the values of your own search service and index.
    search_client = SearchClient(endpoint=service_endpoint,
                                 index_name=index_name,
                                 credential=AzureKeyCredential(api_key))

    # Each entry needs the index's key field (here "id") plus the fields to change.
    documents = [
        {"id": your_document_id, "title": "title"},
    ]
    search_client.merge_or_upload_documents(documents=documents)
    

    The code above merges the new title into the existing document identified by its id. To update several documents, add one entry per document to the documents list and send them all in a single call.
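
For many documents, the list can be built from a mapping and sent in batches. The sketch below is one way to do that, with several assumptions: the helper names are hypothetical, the key field is assumed to be called `id` as in the snippet above, and `merge_documents` is used instead of `merge_or_upload_documents` so that an unknown id is not uploaded as a new document. The batch size of 1000 reflects the per-request document limit of Azure AI Search as I understand it; lower it if your payloads are large.

```python
from itertools import islice


def title_updates(titles_by_id: dict[str, str], key_field: str = "id") -> list[dict]:
    """Turn {document key: new title} into partial documents for a merge."""
    return [{key_field: doc_id, "title": title}
            for doc_id, title in titles_by_id.items()]


def merge_in_batches(search_client, documents: list[dict], batch_size: int = 1000) -> list:
    """Call search_client.merge_documents once per batch and collect the results."""
    results = []
    remaining = iter(documents)
    # islice pulls at most batch_size items; an empty list ends the loop.
    while batch := list(islice(remaining, batch_size)):
        results.extend(search_client.merge_documents(documents=batch))
    return results
```

Called as `merge_in_batches(search_client, title_updates({"some-id": "My custom title"}))`, it returns the per-document results, which are worth checking because a merge for a key that does not exist is reported as a failure for that document.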
