
I generated a document analysis result using:

 with open(pdf_path, "rb") as f:
     poller = document_intelligence_client_sections.begin_analyze_document(
         "prebuilt-layout", f.read(), content_type="application/pdf",
         output_content_format=ContentFormat.MARKDOWN,
     )
 section_layout = poller.result()

 type(section_layout)
 >> azure.ai.documentintelligence.models._models.AnalyzeResult  # Want in this format!

I have saved the result using …as_dict() as follows,

with open("data/section_layouts/result.json", "w") as f:
    json.dump(section_layout.as_dict(), f)

Now, as I load the json using,

with open("result.json", "r") as f:
    data = json.load(f)

I get the data as a dictionary, as expected. However, I want the data back in the AnalyzeResult class format. Can anyone please help? Thank you.

More information:

I am using DocumentIntelligenceClient.

from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import ContentFormat

document_intelligence_client_sections = DocumentIntelligenceClient(
    endpoint=service_endpoint, credential=default_credential)

with open(pdf_path, "rb") as f:
    poller = document_intelligence_client_sections.begin_analyze_document(
        "prebuilt-layout", f.read(), content_type="application/pdf",
        output_content_format=ContentFormat.MARKDOWN,
    )
section_layout = poller.result()

So, the type of the object is different,

type(section_layout)
azure.ai.documentintelligence.models._models.AnalyzeResult

It has neither .to_dict() nor .from_dict().

2

Answers


  1. Chosen as BEST ANSWER

    The solution was rather simple.

    Import AnalyzeResult from azure.ai.documentintelligence.models:

    from azure.ai.documentintelligence.models import AnalyzeResult

    and then pass the dictionary previously saved with .as_dict() to the AnalyzeResult constructor:

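    import json

    # Save the result as a plain dictionary
    with open("test.json", "w") as f:
        json.dump(section_layout.as_dict(), f)

    # Load the dictionary back
    with open("test.json", "r") as f:
        data = json.load(f)

    # Rebuild the AnalyzeResult object from the dictionary
    result = AnalyzeResult(data)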
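
For what it's worth, the round trip can be wrapped in two small helpers. This is only a sketch: `save_result` and `load_result` are names made up here, not part of either SDK. It assumes the result object exposes either as_dict() (azure.ai.documentintelligence) or to_dict() (azure.ai.formrecognizer), and that the model class either offers from_dict() or accepts the dictionary in its constructor, as the snippets in this thread do.

```python
import json
from pathlib import Path


def save_result(result, path):
    """Serialize an analysis result to a JSON file.

    Prefers as_dict() when the object has it and falls back to to_dict().
    """
    to_plain = getattr(result, "as_dict", None) or getattr(result, "to_dict")
    Path(path).write_text(json.dumps(to_plain()), encoding="utf-8")


def load_result(path, model_cls):
    """Rebuild a model object from a JSON file written by save_result."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    from_dict = getattr(model_cls, "from_dict", None)
    # Use from_dict() when the class provides it; otherwise assume the
    # class accepts the dictionary directly in its constructor.
    return from_dict(data) if from_dict else model_cls(data)
```

With the AnalyzeResult import above, `save_result(section_layout, "test.json")` followed by `load_result("test.json", AnalyzeResult)` should give back an AnalyzeResult object.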
    

    I wish the #Microsoft #Azure #DocumentIntelligence team had added this to their documentation!


  2. I get the data in dictionary as expected. However, I wanted to have the data in AnalyzeResult class format. Can anyone please help? Thank you.

    You can use the below code to get the data in AnalyzeResult class format.

    Code:

    import json
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.formrecognizer import AnalyzeResult
    
    endpoint = "https://xxxx.cognitiveservices.azure.com/"
    api_key = "xxxxx"
    document_intelligence_client_sections = DocumentAnalysisClient(endpoint, AzureKeyCredential(api_key))
    
    pdf_path = r"C:\Users\xxx\demo.pdf"
    
    with open(pdf_path, "rb") as f:
        poller = document_intelligence_client_sections.begin_analyze_document(
            model_id="prebuilt-layout", document=f.read(),
        )
        section_layout = poller.result()
    
    with open("result.json", "w") as f:
        json.dump(section_layout.to_dict(), f)  # Use to_dict() to save as JSON
        
    with open("result.json", "r") as f:
        data = json.load(f)
    
    # Convert the dictionary back to an AnalyzeResult object
    analyze_result = AnalyzeResult.from_dict(data)
    
    # Now you can work with the AnalyzeResult object
    print(type(analyze_result)) 
    

    The above code, which uses the azure.ai.formrecognizer package rather than azure.ai.documentintelligence, calls the to_dict() method to save the AnalyzeResult object as a JSON file and then the from_dict() method to convert the JSON data back to an AnalyzeResult object.

    Output:

    <class 'azure.ai.formrecognizer._models.AnalyzeResult'>
    
