I am trying to implement a pipeline in which each step writes a file to a blob container.
Steps:
- an uploaded PDF triggers a function that extracts the text and saves it as a .txt file in the container
- the extracted text triggers a second function that polishes the result and writes a third file to the container
The second step sometimes (most of the time) fails because the stream of the .txt file generated by the first step looks empty (it isn't).
I don't have any budget, so I can't use Service Bus or Event Hubs in Azure. As far as I know, the blob triggers start via polling.
Does anyone have an idea of how I can solve this problem, or where I should start looking?
2 Answers
The problem seems to be the time the Azure Function needs to execute.
As soon as the function is invoked, the output binding creates the blob, but it only writes the data at the end of the execution (which makes sense).
On the other side, if the output file of this function triggers another Azure Function, the second pipeline stage can receive an empty file or not, depending on the execution status of the first one. (I hope I explained myself a bit.)
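The timing gap described above is easy to see without Azure at all. Below is a plain-Python simulation (all names invented for illustration) where a dict stands in for the container: the writer creates an empty blob first and fills it only at the end, while a polling reader may fire in between.

```python
blobs = {}

def writer_with_output_binding(name, text):
    """Mimics the output blob binding: the blob appears (empty) as soon as
    the function starts; the content is written only when it finishes."""
    blobs[name] = b""             # blob exists but is empty
    yield                         # ...long-running extraction happens here...
    blobs[name] = text.encode()   # data written at the very end

def polling_trigger(name):
    """Mimics the second function's blob trigger polling the container."""
    return blobs.get(name)

step = writer_with_output_binding("doc.txt", "extracted text")
next(step)                         # function 1 has started: blob created, still empty
print(polling_trigger("doc.txt"))  # b'' -> the "empty" stream from the question

for _ in step:                     # let function 1 run to completion
    pass
print(polling_trigger("doc.txt"))  # b'extracted text'
```

If the second trigger happens to poll in the window between blob creation and the final write, it sees an empty stream; if it polls later, everything works, which matches the intermittent failures in the question.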
My solution was to stop using the output blob binding. With a ContainerClient instead, I can upload the file all at once when the processing of my data is done.
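A sketch of what that can look like with the Python SDK (`azure-storage-blob`); the container URL, credential, and the `polish` step are placeholders for illustration, not the answerer's actual code:

```python
def polish(text):
    # Stand-in for the real post-processing of the extracted text.
    return " ".join(text.split())

def polish_and_upload(container_url, credential, source_text, target_name):
    # Requires the azure-storage-blob package; imported locally so the
    # pure `polish` helper above stays usable without it.
    from azure.storage.blob import ContainerClient

    client = ContainerClient.from_container_url(container_url, credential=credential)
    polished = polish(source_text)
    # upload_blob sends the complete payload in a single call, so the blob
    # only appears in the container once its content is final -- the next
    # trigger can never observe a created-but-empty blob.
    client.upload_blob(name=target_name, data=polished, overwrite=True)
    return polished
```

The key design point is that the upload happens after all processing is finished, rather than the blob being created at invocation time the way the output binding does it.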
After reproducing this on my end, it worked fine when I followed the process below. I created two blob trigger functions: one converts the PDF to a .txt file, and the other reads the .txt file, performs the required manipulations, and saves the result to another container.
Function1.cs – Reads the PDF file and saves it as a text file in a container
Results:
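The answer's code was posted as C# screenshots that are not reproduced here; as a rough illustration of the same first step, here is a Python sketch (the `pypdf` dependency and the binding names `inputblob`/`outputblob` are assumptions, not taken from the answer):

```python
import io

def pages_to_text(page_texts):
    # Join per-page text with blank lines, skipping pages with nothing extractable.
    return "\n\n".join(t for t in page_texts if t)

def main(inputblob, outputblob):
    # Blob trigger entry point (Python v1 programming model); `inputblob` is
    # the uploaded PDF and `outputblob` the .txt target, both configured in
    # function.json in this hypothetical setup.
    from pypdf import PdfReader  # third-party; assumed installed in the app

    reader = PdfReader(io.BytesIO(inputblob.read()))
    outputblob.set(pages_to_text(page.extract_text() or "" for page in reader.pages))
```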
Function2.cs – Reads the uploaded text file, performs some manipulations, and saves it to another container
Results:
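The second function can be sketched the same way (again, the original answer used C#; `manipulate` and the binding names are invented for illustration):

```python
def manipulate(text):
    # Stand-in for "the required manipulations": trim lines and drop blank ones.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)

def main(inputblob, outputblob):
    # Triggered by the .txt blob written by the first function; the output
    # binding points at a second container in this hypothetical setup.
    outputblob.set(manipulate(inputblob.read().decode("utf-8")))
```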