We have a source table with 1M records. Source, Target DB is Azure DB for SQL Server.
We do not want all 1 M rows in target database. Target table works as a cache table for us. Every fortnight, we want to update target table with updated values from source table, only for record id which are present in this target table.
Can Azure Data Factory handle this?
2
Answers
You can use dataflow and join transformation to achieve this.
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-join
Use inner join to select the values from source and sink into the destination
You can either use the Copy activity to copy the data into a staging table and then call a Stored Process activity to merge the two eg based on a primary key OR use the Copy activity with the stored proc option in the Sink.
The second is more complicated but less steps in the process. The first at least gives you a point to restart from and an audit trail.