
As the title says, I have a starter function that is blob-triggered, and I want to send the blob that triggered the starter through to the activity functions as an input. I can pass an input using the .start_new() method; however, these inputs must be JSON-serializable, which the blob file (func.InputStream) is not.

I have tried decoding the InputStream object (myblob), which allows me to pass it as an input, but I lose some of the basic functionality I would have when it is an InputStream object, and I also need to ensure I can pass it into the subsequent Cognitive Services calls in the activity functions. See below for the code of the starter function.

Starter Function

function.json

```json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "myblob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "container/{name}",
      "connection": "conn str"
    },
    {
      "name": "$return",
      "type": "blob",
      "path": "container/{name}",
      "connection": "AzureWebJobsStorage",
      "direction": "out"
    },
    {
      "name": "starter",
      "type": "durableClient",
      "direction": "in"
    }
  ]
}

```
`__init__.py`

```python
import logging

import azure.functions as func
import azure.durable_functions as df


async def main(myblob: func.InputStream, starter: str) -> func.InputStream:
    logging.info(f"Python blob trigger function processed blob\n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes\n\n")
    client = df.DurableOrchestrationClient(starter)
    # client_input must be JSON-serializable, which func.InputStream is not
    instance_id = await client.start_new('Orchestrator', client_input=myblob)
    logging.info(f"Started orchestration with ID = '{instance_id}'.")
    return myblob
```

Which Returns:
Exception: TypeError: class <class 'azure.functions.blob.InputStream'> does not expose a to_json function
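(Editor's note: the error occurs because `start_new` must JSON-serialize `client_input`, and `func.InputStream` has no `to_json` method. A minimal sketch of one workaround, assuming the activity functions only need the blob's identity rather than the live stream, is to pass plain metadata instead; the helper name below is illustrative, not from the original question:)

```python
import json


def build_orchestrator_input(blob_name: str, blob_uri: str, length: int) -> dict:
    """Return a JSON-serializable description of a blob.

    func.InputStream itself cannot be passed as client_input, but a plain
    dict of metadata can; an activity function can then re-fetch the blob
    from storage using this information.
    """
    payload = {"name": blob_name, "uri": blob_uri, "length": length}
    json.dumps(payload)  # raises TypeError if anything is not serializable
    return payload
```

In the starter above, this dict (built from `myblob.name`, `myblob.uri`, and `myblob.length`) would replace `myblob` as the `client_input` argument.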

2 Answers


  1. Chosen as BEST ANSWER

    I had more issues using the blob trigger, so I switched to an Event Grid trigger instead. I pass the name of the blob that triggered the orchestration (retrieved from the event data) to the activity function and use the Azure SDK to download the file into memory instead, which worked quite well.

    Reasons for moving away from a blob trigger:

    1. High latency: the standard blob trigger is not actually event-driven and relies on polling
    2. For similar reasons, storage log entries are not guaranteed, so blobs can be missed
    3. When dealing with large files (our use case required an upper limit of 2.5 GB), a blob trigger can be compute-intensive, and at times the trigger will fail due to the size of the blob

    Some of this information is presented in the documentation linked below:

    https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob-trigger?pivots=programming-language-python&tabs=python-v2%2Cin-process
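    A sketch of that approach (function names are illustrative, not from the original answer): Event Grid blob-created events carry a `subject` of the form `/blobServices/default/containers/{container}/blobs/{name}`, which can be parsed and then handed to the `azure-storage-blob` SDK to download the file into memory:

```python
def blob_path_from_subject(subject: str) -> tuple:
    """Extract (container, blob_name) from an Event Grid blob-created
    event subject, which has the form
    '/blobServices/default/containers/{container}/blobs/{name}'."""
    container_and_blob = subject.split("/containers/", 1)[1]
    container, blob_name = container_and_blob.split("/blobs/", 1)
    return container, blob_name


def download_blob_to_memory(connection_string: str, container: str, blob_name: str) -> bytes:
    """Download the blob contents with the azure-storage-blob SDK
    (pip install azure-storage-blob; import kept local so the parser
    above stays dependency-free)."""
    from azure.storage.blob import BlobClient

    client = BlobClient.from_connection_string(connection_string, container, blob_name)
    return client.download_blob().readall()
```

    The orchestrator would pass `blob_name` (a plain string, so JSON-serializable) to the activity, which calls `download_blob_to_memory` itself.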


  2. Activity functions can have other regular input and output bindings as well. So you could pass the details of the blob as input to the activity function and use those details to fetch the blob via a blob input binding.
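    A sketch of what the activity function's `function.json` could look like under that approach, assuming the orchestrator passes the blob name as the activity input and that the binding expression resolves against the activity payload (binding names here are illustrative):

```json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "blobname",
      "type": "activityTrigger",
      "direction": "in"
    },
    {
      "name": "blobdata",
      "type": "blob",
      "path": "container/{blobname}",
      "connection": "AzureWebJobsStorage",
      "direction": "in"
    }
  ]
}
```

    The activity's `main` would then receive both the name (`blobname: str`) and the blob contents (`blobdata: func.InputStream`), so no stream ever has to pass through the orchestration payload.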
