How to check if files present under a data lake directory are empty using Azure Data Factory?
There are multiple files in a data lake directory, and I want to check whether each of them is empty. If a file is empty, I want to store its name in a CSV file.
2 Answers
I followed the thread "Azure Data Factory V2 Check file size for Copy Activity": I used an If Condition to check the file size, and Append Variable and Set Variable activities to store the file names. This worked for me.
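A minimal sketch of what the size check could look like as the inner activities of a ForEach loop. The referenced thread's exact setup is not reproduced here, so the activity names, the dataset FileDataset with its fileName parameter, and the variable emptyFiles are all hypothetical. It relies on the Get Metadata activity's size field, which returns the file size in bytes.

```json
[
  {
    "name": "Get Metadata2",
    "description": "Hypothetical activity: reads the size of the current file",
    "type": "GetMetadata",
    "typeProperties": {
      "dataset": {
        "referenceName": "FileDataset",
        "type": "DatasetReference",
        "parameters": {
          "fileName": { "value": "@item().name", "type": "Expression" }
        }
      },
      "fieldList": [ "size" ]
    }
  },
  {
    "name": "If Condition1",
    "description": "True when the file is zero bytes long",
    "type": "IfCondition",
    "dependsOn": [
      { "activity": "Get Metadata2", "dependencyConditions": [ "Succeeded" ] }
    ],
    "typeProperties": {
      "expression": {
        "value": "@equals(activity('Get Metadata2').output.size, 0)",
        "type": "Expression"
      },
      "ifTrueActivities": [
        {
          "name": "Append variable1",
          "description": "Hypothetical array variable collecting empty file names",
          "type": "AppendVariable",
          "typeProperties": {
            "variableName": "emptyFiles",
            "value": { "value": "@item().name", "type": "Expression" }
          }
        }
      ]
    }
  }
]
```

A size check only catches zero-byte files; a file that contains just a header row has a non-zero size.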
If you want to check whether a file is empty, follow the steps below. I reproduced this in my environment and got the following results.
In my storage account I have two files: demo123.csv, which is empty, and avm_name3.csv, which is non-empty.
Step 1: Create a Get Metadata activity with Child items in its field list.
Step 2: Add this dynamic expression as the items of the ForEach activity:
@activity('Get Metadata1').output.childItems
Step 3: Inside the ForEach activity, use a Lookup activity and an If Condition. Add this dynamic expression on the Lookup activity:
@item().name
Use this dynamic expression on the If Condition to tell whether a file is empty:
@equals(activity('Lookup1').output.count,0)
I created two variables of type Array. Add an Append Variable activity inside the True condition, with this value:
@item().name
Step 4: Add a Set Variable activity to the ForEach activity.
After the pipeline ran successfully, I got the name of the empty file.
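The steps above can be sketched as a pipeline definition roughly like the one below. This is an approximation, not the exact pipeline from the screenshots: the dataset names (FolderDataset, FileDataset), the fileName parameter, the variable names (emptyFiles, result), and the placement of the Set Variable activity after the ForEach are assumptions. The Lookup activity only returns output.count when firstRowOnly is false, so the sketch sets it explicitly.

```json
{
  "name": "FindEmptyFiles",
  "properties": {
    "description": "Sketch only: dataset and variable names are hypothetical",
    "variables": {
      "emptyFiles": { "type": "Array" },
      "result": { "type": "Array" }
    },
    "activities": [
      {
        "name": "Get Metadata1",
        "type": "GetMetadata",
        "typeProperties": {
          "dataset": { "referenceName": "FolderDataset", "type": "DatasetReference" },
          "fieldList": [ "childItems" ]
        }
      },
      {
        "name": "ForEach1",
        "type": "ForEach",
        "dependsOn": [
          { "activity": "Get Metadata1", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
          "items": {
            "value": "@activity('Get Metadata1').output.childItems",
            "type": "Expression"
          },
          "activities": [
            {
              "name": "Lookup1",
              "type": "Lookup",
              "typeProperties": {
                "source": { "type": "DelimitedTextSource" },
                "dataset": {
                  "referenceName": "FileDataset",
                  "type": "DatasetReference",
                  "parameters": {
                    "fileName": { "value": "@item().name", "type": "Expression" }
                  }
                },
                "firstRowOnly": false
              }
            },
            {
              "name": "If Condition1",
              "type": "IfCondition",
              "dependsOn": [
                { "activity": "Lookup1", "dependencyConditions": [ "Succeeded" ] }
              ],
              "typeProperties": {
                "expression": {
                  "value": "@equals(activity('Lookup1').output.count, 0)",
                  "type": "Expression"
                },
                "ifTrueActivities": [
                  {
                    "name": "Append variable1",
                    "type": "AppendVariable",
                    "typeProperties": {
                      "variableName": "emptyFiles",
                      "value": { "value": "@item().name", "type": "Expression" }
                    }
                  }
                ]
              }
            }
          ]
        }
      },
      {
        "name": "Set variable1",
        "description": "Assumed placement: copies the collected names once the loop finishes",
        "type": "SetVariable",
        "dependsOn": [
          { "activity": "ForEach1", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
          "variableName": "result",
          "value": { "value": "@variables('emptyFiles')", "type": "Expression" }
        }
      }
    ]
  }
}
```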
If you want to store the names of these empty files in a CSV file, follow this SO thread by Aswin.