I'm using a Copy activity in an Azure Synapse pipeline to copy and filter data from containerA/file1.csv to containerB/file2US.csv.
Similarly, I'm using another Copy activity to copy and filter data from containerA/file1.csv to containerB/file2IND.csv.
The same process repeats for the other regions. In every activity I add a WHERE clause to filter the data and copy it into a region-specific file.
Doing it this way feels redundant. Is there a way to check each row conditionally and copy it to a different sink based on its region value?
What I'm trying to achieve is a SINGLE ACTIVITY that selects the correct sink based on the condition each row matches.
2 Answers
The activity you are looking for is a Data Flow. Add a Conditional Split transformation with one condition per region, and connect each output stream to its own sink.
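To make the routing idea concrete, here is a minimal plain-Python sketch of what a conditional split does conceptually: the source is read once and each row goes to the stream of the first condition it matches, with a default stream for everything else. This is not Synapse code or its API; the function name `conditional_split`, the `region` column, and the stream names are assumptions made up for the illustration.

```python
def conditional_split(rows, conditions, default="other"):
    """Route each row to the stream of the first matching condition.

    rows:       iterable of dicts (one dict per CSV row)
    conditions: dict mapping stream name -> predicate(row) -> bool
    default:    hypothetical name of the stream for unmatched rows
    """
    streams = {name: [] for name in conditions}
    streams[default] = []
    for row in rows:
        for name, predicate in conditions.items():
            if predicate(row):
                streams[name].append(row)
                break  # first match wins
        else:
            # No condition matched, so the row goes to the default stream.
            streams[default].append(row)
    return streams


# Example: one pass over the data, one output stream per region.
rows = [
    {"id": "1", "region": "US"},
    {"id": "2", "region": "IND"},
    {"id": "3", "region": "UK"},
]
streams = conditional_split(
    rows,
    {
        "US": lambda r: r["region"] == "US",
        "IND": lambda r: r["region"] == "IND",
    },
)
```

In the Data Flow, each stream would then be wired to a sink such as containerB/file2US.csv or containerB/file2IND.csv.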
I would approach this using a ForEach activity, which runs its iterations in parallel, and a parameterised Copy activity. You can use an array parameter to list the regions you want to loop through. Here's an example with continents:
Set up your pipeline like this:
Use the Query option in the Source and parameterise it with the "Add dynamic content" button. Alternatively, use a stored procedure. Parameterise the Sink using a dataset parameter. This gives you control of the output filename and location.
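As a rough illustration of this pattern, the plain-Python sketch below mimics a parallel loop over a region list, where each iteration runs the same parameterised "copy": it filters the source by the current region and derives the sink file name from it. This is not Synapse code; `copy_filtered`, `run_for_each`, the `region` column, and the `file2<region>.csv` naming are assumptions based on the question. Unlike a conditional split, this reads the source once per region.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def copy_filtered(source_csv, region):
    """One parameterised 'copy': keep rows for `region`, name the sink after it."""
    reader = csv.DictReader(io.StringIO(source_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row["region"] == region:  # stands in for the parameterised query
            writer.writerow(row)
    # The sink name is derived from the loop item, like a dataset parameter.
    return f"file2{region}.csv", out.getvalue()


def run_for_each(source_csv, regions):
    """Run one filtered copy per region in parallel; return {sink name: CSV text}."""
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(lambda r: copy_filtered(source_csv, r), regions))


source = "id,region\n1,US\n2,IND\n3,US\n"
outputs = run_for_each(source, ["US", "IND"])
```

Adding a region then only means adding an entry to the array, not another activity.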