I imported some text files into Azure OpenAI. After the import, I see a "title" field used for search, which I can’t edit via the UI because it’s greyed out.
How can I define the title for each document? For example, does the Azure OpenAI On Your Data API allow me to define the title for each document?
By default, titles are prepopulated via automated summarization (which seems to be simply truncation?). I can see some titles e.g. via:
import os
import pprint
from openai import AzureOpenAI
#from azure.identity import DefaultAzureCredential, get_bearer_token_provider
endpoint = os.getenv("ENDPOINT_URL", "https://[redacted].openai.azure.com/")
deployment = os.getenv("DEPLOYMENT_NAME", "[redacted GPT engine name]")
search_endpoint = os.getenv("SEARCH_ENDPOINT", "https://[redacted].search.windows.net")
search_key = os.getenv("SEARCH_KEY", "[redacted key]")
search_index = os.getenv("SEARCH_INDEX_NAME", "[redacted]")
# token_provider = get_bearer_token_provider(
#     DefaultAzureCredential(),
#     "https://cognitiveservices.azure.com/.default")
client = AzureOpenAI(
    azure_endpoint=endpoint,
    api_version="2024-05-01-preview",
    api_key='[redacted key]'
)
# azure_ad_token_provider=token_provider,
completion = client.chat.completions.create(
    model=deployment,
    messages=[
        {
            "role": "user",
            "content": "How can I sort a Python list?"
        }],
    max_tokens=800,
    temperature=0,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None,
    stream=False,
    extra_body={
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": f"{search_endpoint}",
                "index_name": "[redacted]",
                "semantic_configuration": "default",
                "query_type": "vector_semantic_hybrid",
                "fields_mapping": {},
                "in_scope": True,
                "role_information": "You are an AI assistant that helps people find information.",
                "filter": None,
                "strictness": 5,
                "top_n_documents": 10,
                "authentication": {
                    "type": "api_key",
                    "key": f"{search_key}"
                },
                "embedding_dependency": {
                    "type": "deployment_name",
                    "deployment_name": "[redacted]"
                }
            }
        }]
    }
)
print(completion.to_json())
outputs:
{
  "id": "7eb67d03-3868-46fe-8cb1-fdf821c633be",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "To acquire [...].",
        "role": "assistant",
        "end_turn": true,
        "context": {
          "citations": [
            {
              "content": "You can copy to your computer [...]",
              "title": "You can copy [...]",
              "url": "https://[redacted].blob.core.windows.net/fileupload-he/920.txt",
              "filepath": "000920.txt",
              "chunk_id": "0"
            },
            {
              "content": "Do\r\none of the following:\r\nChoose File > Automate [...]",
              "title": "Do",
              "url": "https://storingspace.blob.core.windows.net/fileupload-b/002715.txt",
              "filepath": "002715.txt",
              "chunk_id": "0"
            },
            [...]
          ],
          "intent": "[\"How to import x\", \"Importing x\", \"Steps to import x\"]"
        }
      }
    }
  ],
  "created": 1720747501,
  "model": "gpt-4o",
  "object": "extensions.chat.completion",
  "system_fingerprint": "fp_abc28019ad",
  "usage": {
    "completion_tokens": 230,
    "prompt_tokens": 5480,
    "total_tokens": 5710
  }
}
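To check only the titles without reading the whole JSON, the citations can be pulled out of the response. This is a minimal sketch, assuming the response has the shape shown above; the helper name `citation_titles` and the trimmed sample are illustrative only.

```python
def citation_titles(response):
    """Return (filepath, title) pairs for every citation in a response dict."""
    pairs = []
    for choice in response.get("choices", []):
        # The On Your Data context sits under message.context in the output above.
        context = choice.get("message", {}).get("context", {})
        for citation in context.get("citations", []):
            pairs.append((citation.get("filepath"), citation.get("title")))
    return pairs


# Trimmed sample with the same shape as the output above.
sample = {
    "choices": [
        {
            "message": {
                "context": {
                    "citations": [
                        {"title": "You can copy [...]", "filepath": "000920.txt"},
                        {"title": "Do", "filepath": "002715.txt"},
                    ]
                }
            }
        }
    ]
}

for filepath, title in citation_titles(sample):
    print(f"{filepath}: {title!r}")
```

With a real response, the dict could be obtained with `json.loads(completion.to_json())`.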
2 Answers
The Azure OpenAI On Your Data API does not offer this kind of modification to the Azure AI Search index; it only returns the results for the search query, with citations based on the data in the index.
To modify fields or field values, you need to use the Azure AI Search API or SDK.
The Azure AI Search documentation on "Add, Update or Delete Documents" describes how to do this via the REST API.
In your case, create a request body that lists each unique document key together with its new title, and update the documents via the REST API.
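The original request example is not preserved here, so the following is only a sketch of what such a request could look like, written as a Python helper that builds the URL and body without sending anything. It assumes the index key field is named `id` and uses API version `2023-11-01`; the service name, index name, document key and title are placeholders, and `build_title_update_request` is a hypothetical helper.

```python
import json

# Assumption: a stable Azure AI Search REST API version; adjust to your service.
API_VERSION = "2023-11-01"


def build_title_update_request(service_name, index_name, titles_by_id, key_field="id"):
    """Return (url, body) for a 'merge' request that sets the title of each document."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{index_name}/docs/index?api-version={API_VERSION}"
    )
    body = {
        "value": [
            # "merge" updates only the listed fields of an existing document.
            {"@search.action": "merge", key_field: doc_id, "title": title}
            for doc_id, title in titles_by_id.items()
        ]
    }
    return url, body


# Placeholder values; the real keys come from your index.
url, body = build_title_update_request(
    "my-service", "my-index", {"<document key>": "How to copy files"}
)
print("POST", url)
# Required headers: Content-Type: application/json and api-key: <admin key>
print(json.dumps(body, indent=2))
```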
Using Python:
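The original snippet is not preserved here; a minimal sketch with the `azure-search-documents` package might look like the following. It assumes the key field is named `id`; the endpoint, index name, admin key, document key and title are placeholders, and `build_documents` and `update_titles` are hypothetical helper names.

```python
def build_documents(titles_by_id, key_field="id"):
    """Build one partial document per key; only the title field is set."""
    return [
        {key_field: doc_id, "title": title}
        for doc_id, title in titles_by_id.items()
    ]


def update_titles(search_endpoint, index_name, admin_key, documents):
    """Merge the partial documents into the index; return success per document key."""
    # Imported here so build_documents can be used without the SDK installed.
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    client = SearchClient(search_endpoint, index_name, AzureKeyCredential(admin_key))
    # merge_documents changes only the supplied fields of existing documents.
    results = client.merge_documents(documents=documents)
    return {result.key: result.succeeded for result in results}


# Placeholder key and title; the real keys come from your index.
documents = build_documents({"<document key>": "How to copy files"})
# update_titles("https://<service>.search.windows.net", "<index>", "<admin key>", documents)
```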
The above merges/updates the title field of the current document based on the id. For multiple documents, append to the documents list and update all documents at once!