skip to Main Content

I took 2 lookups to read 2 excel files and connected to 2 for each activities and inside i have taken 2 set variable and given @item.id
Now i have to eliminate the matching records and only unmatched records has to store in a new variable

I have tried if condition and given the expression if(equals(setvariable1,setvariable2),null,setvariable1)

But its not working
1st table

Name  id
Ganesh 1
Vivek  2
Lah    3

2nd table

Name  id
Ganesh 1
Vivek  2
Man    3

Required output

Man  3
Lah  3

2

Answers


    1. Add 2 source datasets one for each of the data
    2. Add derived column transformation to each of the sources and create a new column called ‘hashing’ with the expression: "crc32(Name,Id)"
    3. Add Assert transformation to both the branches and select assert type as ‘Expect exists’ and provide ‘hashing1’==’hashing2’ in the expression for the first assert and ‘hashing2’==’hashing1’ in the first assert
    4. Add Alter row transformation and set condition as ‘Delete if’ = not(hasError('assert1')) and not(hasError('assert2')) respectively in both the branches
    5. Add Select transformation in both the branches and from first branch remove all columns except : ‘source1@Name’ and ‘source1@Id’
      From second branch, remove all columns except ‘source2@Name’ and ‘source2@Id’
    6. Add Union Transformation to union both the branches
    7. Add Sink transformation to load the output to destination file/table

    enter image description here

    {
    "name": "dataflow12",
    "properties": {
        "type": "MappingDataFlow",
        "typeProperties": {
            "sources": [
                {
                    "dataset": {
                        "referenceName": "DelimitedText21",
                        "type": "DatasetReference"
                    },
                    "name": "source1"
                },
                {
                    "dataset": {
                        "referenceName": "DelimitedText22",
                        "type": "DatasetReference"
                    },
                    "name": "source2"
                }
            ],
            "sinks": [
                {
                    "dataset": {
                        "referenceName": "AzureSqlTable1",
                        "type": "DatasetReference"
                    },
                    "name": "sink1"
                }
            ],
            "transformations": [
                {
                    "name": "derivedColumn1"
                },
                {
                    "name": "derivedColumn2"
                },
                {
                    "name": "assert1"
                },
                {
                    "name": "assert2"
                },
                {
                    "name": "alterRow1"
                },
                {
                    "name": "alterRow2"
                },
                {
                    "name": "select1"
                },
                {
                    "name": "select2"
                },
                {
                    "name": "union1"
                }
            ],
            "scriptLines": [
                "source(output(",
                "          Name as string,",
                "          Id as string",
                "     ),",
                "     allowSchemaDrift: true,",
                "     validateSchema: false,",
                "     ignoreNoFilesFound: false) ~> source1",
                "source(output(",
                "          Name as string,",
                "          Id as string",
                "     ),",
                "     allowSchemaDrift: true,",
                "     validateSchema: false,",
                "     ignoreNoFilesFound: false) ~> source2",
                "source1 derive(hashing1 = crc32(Name,Id)) ~> derivedColumn1",
                "source2 derive(hashing2 = crc32(Name,Id)) ~> derivedColumn2",
                "derivedColumn1, derivedColumn2 assert(expectExists(hashing1 == hashing2, false, 'assert1')) ~> assert1",
                "derivedColumn2, derivedColumn1 assert(expectExists(hashing2 == hashing1, false, 'assert2')) ~> assert2",
                "assert1 alterRow(deleteIf(not(hasError('assert1')))) ~> alterRow1",
                "assert2 alterRow(deleteIf(not(hasError('assert2')))) ~> alterRow2",
                "alterRow1 select(mapColumn(",
                "          Name = source1@Name,",
                "          Id = source1@Id",
                "     ),",
                "     skipDuplicateMapInputs: true,",
                "     skipDuplicateMapOutputs: true) ~> select1",
                "alterRow2 select(mapColumn(",
                "          Name = source2@Name,",
                "          Id = source2@Id",
                "     ),",
                "     skipDuplicateMapInputs: true,",
                "     skipDuplicateMapOutputs: true) ~> select2",
                "select1, select2 union(byName: true)~> union1",
                "union1 sink(allowSchemaDrift: true,",
                "     validateSchema: false,",
                "     deletable:false,",
                "     insertable:true,",
                "     updateable:false,",
                "     upsertable:false,",
                "     format: 'table',",
                "     skipDuplicateMapInputs: true,",
                "     skipDuplicateMapOutputs: true,",
                "     errorHandlingOption: 'stopOnFirstError') ~> sink1"
            ]
        }
    }
    

    }

    Login or Signup to reply.
  1. To Fetch the Missing records in azure data factory you need to use filter activity .

    Follow the below steps:

    • Take 2 LookUps each of a on excel file as below:
      enter image description here

    • After Each lookup Take Filter activity to get Missing record from that file.
      with below expressions:

    --For table 1
    items - @activity('Lookup1').output.value
    Condition - @not(contains(activity('Lookup2').output.value,item()))
    
    --For table 2
    items - @activity('Lookup2').output.value
    Condition - @not(contains(activity('Lookup1').output.value,item()))
    

    enter image description here
    This will filter out the missing records.

    • Now take variable with array type and expression as @union(activity('Filter_for_table1').output.value,activity('Filter_for_table2').output.value)
      enter image description here
      This will result you expected output as below:

    Output:

    enter image description here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search