
Does anyone know how to create Azure AI Search indexes for email .msg files?

I have been able to find sample indexes for JSON content but can’t seem to find samples that index email content.

I would like to be able to create an index based on the common email properties: From, To, CC, Subject, Sent Date, and body.

I believe it would be something like:

    {
        "name": "email-index",
        "fields": [
            {"name": "From", "type": "Edm.String", "key": true, "filterable": true},
            {"name": "To", "type": "Collection(Edm.String)"},
            {"name": "CC", "type": "Collection(Edm.String)"},
            {"name": "BCC", "type": "Collection(Edm.String)"},
            {"name": "DateSent", "type": "Edm.DateTimeOffset", "filterable": false, "sortable": false, "facetable": false},
            {"name": "Body", "type": "Edm.String", "searchable": true, "filterable": true, "sortable": true, "facetable": true}
        ]
    }

I can’t find samples of the .msg email fields to construct the index.

2 Answers


  1. Chosen as BEST ANSWER

    I was able to create an index and indexer that let me query on the following fields:

    • metadata_content_type
    • metadata_message_from
    • metadata_message_from_email
    • metadata_message_to
    • metadata_message_to_email
    • metadata_message_cc
    • metadata_message_cc_email
    • metadata_message_bcc
    • metadata_message_bcc_email
    • metadata_creation_date
    • metadata_last_modified
    • metadata_subject

    https://learn.microsoft.com/en-us/azure/search/search-blob-metadata-properties
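
    As a rough sketch of what such an index could look like, the snippet below builds an index definition (in the JSON shape used by the REST API) from the fields listed above. The index name, the `metadata_storage_path` key, the `content` field for the message body, and the Edm types chosen for each field are assumptions for illustration, not details confirmed by this answer.

```python
import json

# Metadata properties named in the answer. The Edm types are an assumption.
STRING_FIELDS = [
    "metadata_content_type",
    "metadata_message_from",
    "metadata_message_from_email",
    "metadata_message_to",
    "metadata_message_to_email",
    "metadata_message_cc",
    "metadata_message_cc_email",
    "metadata_message_bcc",
    "metadata_message_bcc_email",
    "metadata_subject",
]
DATE_FIELDS = ["metadata_creation_date", "metadata_last_modified"]


def build_msg_index(index_name: str) -> dict:
    """Return an index definition covering the .msg metadata fields."""
    fields = [
        # Assumed document key: the storage path of the source file.
        {"name": "metadata_storage_path", "type": "Edm.String", "key": True},
        # Assumed field for the extracted message text.
        {"name": "content", "type": "Edm.String", "searchable": True},
    ]
    for name in STRING_FIELDS:
        fields.append(
            {"name": name, "type": "Edm.String", "searchable": True, "filterable": True}
        )
    for name in DATE_FIELDS:
        fields.append(
            {
                "name": name,
                "type": "Edm.DateTimeOffset",
                "searchable": False,
                "filterable": True,
                "sortable": True,
            }
        )
    return {"name": index_name, "fields": fields}


# "msg-email-index" is a hypothetical index name.
index_definition = build_msg_index("msg-email-index")
print(json.dumps(index_definition, indent=2))
```

    The printed JSON could then be used as the body of a create-index request or pasted into the portal's JSON editor, after adjusting the attributes to match the queries you need.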


  2. The code below creates or updates an index for email data in Azure AI Search (formerly Azure Cognitive Search) using the Azure SDK for Python.

    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents.indexes import SearchIndexClient
    from azure.search.documents.indexes.models import SearchIndex, SearchableField
    
    # Azure AI Search service endpoint and admin key
    service_name = "YOUR_SEARCH_SERVICE_NAME"
    admin_key = "YOUR_SEARCH_SERVICE_ADMIN_API_KEY"
    endpoint = f"https://{service_name}.search.windows.net/"
    
    # Index name
    index_name = "email-index2"
    
    # Define the fields for the schema. SearchableField is used because
    # SimpleField creates non-searchable fields. It produces Edm.String,
    # or Collection(Edm.String) when collection=True.
    fields = [
        SearchableField(name="EmailId", key=True),
        SearchableField(name="From", filterable=True),
        SearchableField(name="To", collection=True, filterable=True),
        SearchableField(name="CC", collection=True, filterable=True),
        SearchableField(name="BCC", collection=True, filterable=True),
        # Kept as a string here; Edm.DateTimeOffset would allow date range filters.
        SearchableField(name="DateSent", filterable=True),
        SearchableField(name="Subject", filterable=True),
        SearchableField(name="Body", filterable=True),
    ]
    
    # Instantiate the SearchIndex object with the defined fields
    index = SearchIndex(name=index_name, fields=fields)
    
    # Instantiate the SearchIndexClient
    credential = AzureKeyCredential(admin_key)
    index_client = SearchIndexClient(endpoint=endpoint, credential=credential)
    
    # Create or update the index
    try:
        index_client.create_or_update_index(index=index)
        print(f"Index '{index_name}' created or updated successfully.")
    except Exception as e:
        print(f"An error occurred: {e}")
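
    Once the index exists, each email has to be turned into a document whose keys match the field names. The sketch below shows one possible mapping. The `make_email_document` helper and its parameters are hypothetical, and it assumes the email has already been parsed into Python values by some other means. Document keys in Azure AI Search may only contain letters, digits, underscores, dashes, and equal signs, so the message ID is URL-safe base64 encoded here.

```python
import base64
from datetime import datetime, timezone


def make_email_document(message_id, sender, to, cc, bcc, sent, subject, body):
    """Map parsed email values onto the field names of the index."""
    # Encode the message ID so the key only contains allowed characters.
    key = base64.urlsafe_b64encode(message_id.encode("utf-8")).decode("ascii")
    return {
        "EmailId": key,
        "From": sender,
        "To": list(to),
        "CC": list(cc),
        "BCC": list(bcc),
        # ISO 8601 text in UTC; `sent` is expected to be timezone-aware.
        "DateSent": sent.astimezone(timezone.utc).isoformat(),
        "Subject": subject,
        "Body": body,
    }


# Example values are made up for illustration.
doc = make_email_document(
    message_id="<abc123@example.com>",
    sender="alice@example.com",
    to=["bob@example.com", "carol@example.com"],
    cc=[],
    bcc=[],
    sent=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    subject="Quarterly report",
    body="Please find the report attached.",
)
print(doc["EmailId"], doc["DateSent"])
```

    A list of such dictionaries can be passed to `SearchClient.upload_documents(documents=[...])` to load them into the index.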
    
    

    Supported data types of Azure AI Search

    {
        "name": "email-index",
        "fields": [
            {"name": "EmailId", "type": "Edm.String", "key": true, "searchable": true},
            {"name": "From", "type": "Edm.String", "searchable": true, "filterable": true},
            {"name": "To", "type": "Collection(Edm.String)", "searchable": true, "filterable": true},
            {"name": "CC", "type": "Collection(Edm.String)", "searchable": true, "filterable": true},
            {"name": "BCC", "type": "Collection(Edm.String)", "searchable": true, "filterable": true},
            {"name": "DateSent", "type": "Edm.String", "searchable": true, "filterable": true},
            {"name": "Subject", "type": "Edm.String", "searchable": true, "filterable": true},
            {"name": "Body", "type": "Edm.String", "searchable": true, "filterable": true}
        ]
    }
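
    Because From, To, CC, and BCC are marked filterable, queries can narrow results with an OData filter. The sketch below builds such a filter string. The `email_filter` and `odata_quote` helpers are hypothetical names, and the field names are the ones from the definition above. Collection fields such as To are matched with the `any` lambda operator, and single quotes inside values are escaped by doubling them.

```python
def odata_quote(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def email_filter(sender=None, recipient=None) -> str:
    """Build an OData filter on the From field and the To collection."""
    clauses = []
    if sender is not None:
        clauses.append(f"From eq {odata_quote(sender)}")
    if recipient is not None:
        # Collection(Edm.String) fields need a lambda expression.
        clauses.append(f"To/any(t: t eq {odata_quote(recipient)})")
    return " and ".join(clauses)


filter_text = email_filter(sender="alice@example.com", recipient="bob@example.com")
print(filter_text)
```

    The resulting string can be passed as the `filter` argument of `SearchClient.search`, for example `search_client.search(search_text="report", filter=filter_text)`. Filter comparisons match the whole stored value exactly, so they complement full-text search on Subject and Body rather than replace it.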
    
    


    • Azure Files indexer with Azure AI Search

    In the Azure portal: (screenshot not preserved)

    • Indexing file contents and metadata in Azure Cognitive Search.
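
    If the .msg files sit in Azure storage, an indexer can extract their content and metadata automatically. The sketch below builds an indexer definition in the JSON shape used by the REST API. The indexer, data source, and index names are hypothetical, and it assumes the target index uses `metadata_storage_path` as its key, which is why that field is base64 encoded through a field mapping.

```python
import json


def build_msg_indexer(indexer_name: str, data_source_name: str, index_name: str) -> dict:
    """Return an indexer definition that only processes .msg files."""
    return {
        "name": indexer_name,
        "dataSourceName": data_source_name,
        "targetIndexName": index_name,
        "parameters": {
            "configuration": {
                # Restrict indexing to Outlook message files.
                "indexedFileNameExtensions": ".msg",
                # Extract both the message text and its metadata properties.
                "dataToExtract": "contentAndMetadata",
            }
        },
        "fieldMappings": [
            {
                # Encode the path so it is valid as a document key.
                "sourceFieldName": "metadata_storage_path",
                "targetFieldName": "metadata_storage_path",
                "mappingFunction": {"name": "base64Encode"},
            }
        ],
    }


# All three names below are placeholders.
indexer_definition = build_msg_indexer(
    "msg-email-indexer", "msg-email-datasource", "msg-email-index"
)
print(json.dumps(indexer_definition, indent=2))
```

    Metadata properties whose names match index field names are mapped automatically, so explicit field mappings are only needed where the names differ or a mapping function is required.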