
I successfully tested a Python Azure Function locally that, when triggered by an upload to blob storage, converts any .xlsb file to a .xlsx file. A <100 KB .xlsb file took about a minute, and a ~70,000 KB .xlsb file took 15 minutes (that feels very long! I am also looking for performance suggestions here if you have any). The function code is below.

import logging
import pandas as pd
from io import BytesIO
import azure.functions as func
from azure.storage.blob import BlobServiceClient
 
app = func.FunctionApp()
 
@app.blob_trigger(arg_name="myblob", path="raw/Input/{name}.xlsb",
                               connection="")
def blob_trigger(myblob: func.InputStream):
    logging.info(f"Python blob trigger function processed blob. "
                 f"Name: {myblob.name}, "
                 f"Blob Size: {myblob.length} bytes")
   
    accountName = ""
    accountKey = ""
    connectionString = f"DefaultEndpointsProtocol=https;AccountName={accountName};AccountKey={accountKey};EndpointSuffix=core.windows.net"
   
    inputBlobname = myblob.name.replace("raw/Input/", "")
    containerName = "raw/Output"
    outputBlobname = inputBlobname.replace(".xlsb", ".xlsx")
 
    blob_service_client = BlobServiceClient.from_connection_string(connectionString)
    container_client = blob_service_client.get_container_client(containerName)
 
    input_data = myblob.read()
    df = pd.read_excel(BytesIO(input_data), engine="pyxlsb")
 
    output_data = BytesIO()
    df.to_excel(output_data, index=False)
    output_data.seek(0)
 
    blob_client = container_client.get_blob_client(outputBlobname)
    blob_client.upload_blob(output_data.getvalue(), overwrite=True)
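On the performance question: most of the 15 minutes is probably spent materialising the whole workbook as a pandas DataFrame and then re-serialising it with `to_excel`. A possible alternative (a sketch, not tested against your data) is to stream rows straight from pyxlsb into openpyxl's write-only mode, skipping the DataFrame round-trip entirely. pyxlsb is already used above, and openpyxl is the writer pandas uses for .xlsx output, so no new dependencies should be needed:

```python
from io import BytesIO

def xlsb_to_xlsx_streaming(src: bytes) -> bytes:
    """Convert .xlsb bytes to .xlsx bytes row by row, without building
    a pandas DataFrame in between. Cell values are copied as-is;
    formatting and formulas are not preserved (the pandas round-trip
    above does not preserve them either)."""
    # Imported inside the helper so the sketch is easy to lift into
    # the function app as a single self-contained unit.
    from pyxlsb import open_workbook
    from openpyxl import Workbook

    out = Workbook(write_only=True)  # write-only mode streams rows out
    with open_workbook(BytesIO(src)) as wb:
        for sheet_name in wb.sheets:
            ws = out.create_sheet(title=sheet_name)
            with wb.get_sheet(sheet_name) as sheet:
                for row in sheet.rows():
                    ws.append([cell.v for cell in row])

    buf = BytesIO()
    out.save(buf)
    return buf.getvalue()
```

The function body would then reduce to `output_data = xlsb_to_xlsx_streaming(myblob.read())` followed by the upload. Whether this actually beats pandas depends on the workbook's shape, so it is worth benchmarking against the 70,000 KB file before committing to it.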

I have deployed the function to Azure from VS Code and tested it successfully with the smaller .xlsb file. However, when I try the larger file, nothing happens, not even a failed invocation. I assume it times out, but I never see a failed invocation. Can anyone explain why? (The attached screenshot of the function's invocations shows 4 successes and 0 failures in the last 30 days.)


Answers


  1. Chosen as BEST ANSWER

    This is not really an answer, but after 32 minutes with the 70,000+ KB file, I did receive a failed invocation. The error was a timeout; here is the message: "Timeout value of 00:30:00 exceeded by function 'Functions.blob_trigger'. Initiating cancellation."
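For anyone hitting the same wall: I believe the 00:30:00 limit is the `functionTimeout` setting in host.json (30 minutes is the default on the Premium and Dedicated plans). On those plans it can be raised; the value below is just an example, not the one from my app:

```json
{
  "version": "2.0",
  "functionTimeout": "01:00:00"
}
```

On the Consumption plan the ceiling is 10 minutes regardless of this setting, so a conversion this long would need a Premium or Dedicated plan, or the work split up (for example with Durable Functions).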


  2. The issue could be due to the size of the .xlsb file.

    • Check whether the converted file is available in the raw/output folder in the container.
    • Validate the storage connection string used in the function app, upload the file again to the container, and wait for some time for the error details to appear under the function invocations.

    I have tried the same in my environment with your code:

    blob_service_client = BlobServiceClient.from_connection_string(connectionString)
    container_client = blob_service_client.get_container_client(containerName)

    input_data = myblob.read()
    df = pd.read_excel(BytesIO(input_data), engine="pyxlsb")

    output_data = BytesIO()
    df.to_excel(output_data, index=False)
    output_data.seek(0)

    blob_client = container_client.get_blob_client(output_blobname)
    blob_client.upload_blob(output_data, overwrite=True)

    logging.info(f"Processed and uploaded blob: {output_blobname}")
    
    • Initially uploaded a .xlsb file of 8.55 KB and was able to trigger the function.


    • File got converted to .xlsx and uploaded to raw/output folder.


    • Uploaded the .xlsb file with 70,000+ KB size to the Azure Blob Storage container.

    I am able to see the error invocations in the Portal under Function App => Invocations.
