skip to Main Content

Inside a Azure synapse pipeline, I need to find and compare the row count of the CSV file with summary file stored in Azure Blob storage.

If the comparison fails then pipeline will stop, if the counts match then can proceed ahead in the pipeline.

pipeline, which components

2

Answers


  1. Chosen as BEST ANSWER

    Given my CSV is more than 10 Mill rows and in Synapse, I also found an alternative: Azure databricks notebook call from azure data factory based on if/else flag


  2. If your rows are less than 5000, you can use lookup activities for this requirement which will give the required row count of the files.

    Create datasets for your files and give those to two lookup activities like below. Make sure you uncheck the First row only option in both activities.

    enter image description here

    Now, take an if activity and use the below expression to compare both row counts.

    @equals(activity('count of control').output.count,activity('count of csv').output.count)
    

    enter image description here

    If the condition satisfies, you can add the further activities in the True activities of if activity. Otherwise, you can stop the pipeline by not giving any activity in the False activities of if activity.

    If the row count in your control file is like below sample.

    filename,count
    one.csv,5
    

    Then, compare the count property from the control file lookup activity output array with the CSV file row count.

    And, if the CSV file count is more than 5000, you can use ADF dataflow for this.

    Get the row count of CSV file using below transformations in the Dataflow.

    • Source (csvfile)
    • DerivedColumn -> create new column rowcount and give some value.
    • Aggregate – count(rowcount)
    • sink – Use sink cache to pass this row count value to the pipeline.

    After dataflow activity, compare the row count of the CSV file with your control file count using if activity.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search