I am copying data from a CSV file in ADLS to an Azure SQL database.
The file contains 9 columns and the mapping is defined accordingly, but sometimes the source file has only 6 or 7 columns. In that case the pipeline fails with the error:
The name of column index ‘x’ is empty. Make sure column name is properly specified in the header row
Even if some columns are missing, I want to copy the data for the columns that are available and ignore the missing ones.
How can I handle this case dynamically and build the mapping when some columns are missing? The source files will have a different set of columns each time.
2 Answers
You can remove the mapping from the Copy activity so it will try to map automatically.
Note: the column order in the CSV file should match the column order in the table.
I created a pipeline and added a Get Metadata activity with the Column count field, using the ADLS dataset, to get the number of columns of my CSV file in the ADLS account. After the Get Metadata activity succeeded, I connected a Lookup activity, selected the SQL database dataset, and ran a query to find the number of columns of my SQL table. After the lookup succeeded, I connected an If Condition activity with an expression that compares the column counts from the Get Metadata and Lookup activities; sketches of both follow.
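The post did not include the actual query or expression, so here is a minimal sketch. Assuming the target table is dbo.MyTargetTable (a hypothetical name), the Lookup query to count the SQL table's columns could be:

    SELECT COUNT(*) AS columnCount
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = 'dbo'
      AND TABLE_NAME = 'MyTargetTable';

Assuming activity names Get Metadata1 and LookupColumnCount (both assumptions), the If Condition expression to compare the two counts could be:

    @equals(activity('Get Metadata1').output.columnCount, activity('LookupColumnCount').output.firstRow.columnCount)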
If the condition is true, I implemented a Copy activity to copy the data from ADLS to SQL as follows: I selected a delimited text dataset pointing at the CSV file through the ADLS linked service as the source, enabled First row as header in the dataset, selected the SQL database dataset as the sink, and imported the schema in the Copy activity's mapping.
If the condition is false, I implemented dynamic mapping as follows:
I created a control table in the SQL database and inserted one row per mapping. In the jsonMapping column I stored the source columns and the table columns where the values need to go, in the Copy activity's mapping format; a sketch follows.
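The exact columns and values were not included in the post, but based on the expressions used later (sinkTableSchema, sinkTableName, jsonMapping), the control table could look like the following sketch; the table name ColumnMappings and the columnCount lookup key are assumptions:

    CREATE TABLE dbo.ColumnMappings (
        columnCount     INT            NOT NULL,  -- number of columns in the incoming file
        sinkTableSchema NVARCHAR(128)  NOT NULL,  -- target schema name
        sinkTableName   NVARCHAR(128)  NOT NULL,  -- target table name
        jsonMapping     NVARCHAR(MAX)  NOT NULL   -- Copy activity mapping JSON
    );

    INSERT INTO dbo.ColumnMappings (columnCount, sinkTableSchema, sinkTableName, jsonMapping)
    VALUES (6, 'dbo', 'MyTargetTable', N'{
      "type": "TabularTranslator",
      "mappings": [
        { "source": { "name": "col1" }, "sink": { "name": "Col1" } },
        { "source": { "name": "col2" }, "sink": { "name": "Col2" } }
      ]
    }');

The jsonMapping value uses the TabularTranslator format that the Copy activity's mapping expects.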
I implemented a Lookup activity by selecting the SQL database dataset and entered a query to fetch the matching mapping row; a sketch follows.
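A sketch of such a query, assuming the ColumnMappings control table from above and filtering on the column count returned by Get Metadata (the @{...} syntax is ADF dynamic-content string interpolation):

    SELECT sinkTableSchema, sinkTableName, jsonMapping
    FROM dbo.ColumnMappings
    WHERE columnCount = @{activity('Get Metadata1').output.columnCount};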
After the lookup succeeded, I connected a Copy activity, selected the ADLS delimited text dataset as the source and the SQL database dataset as the sink, and created two parameters on the sink dataset: schema, with the value
@activity('Lookup1').output.firstRow.sinkTableSchema
and table, with the value
@activity('Lookup1').output.firstRow.sinkTableName
I then used those parameters to select the table in the dataset as
@dataset().schema.@dataset().table
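For reference, a minimal sketch of the parameterized Azure SQL sink dataset, assuming the parameter names schema and table as above (the dataset and linked service names are hypothetical):

    {
      "name": "AzureSqlSinkDataset",
      "properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": {
          "referenceName": "AzureSqlLinkedService",
          "type": "LinkedServiceReference"
        },
        "parameters": {
          "schema": { "type": "string" },
          "table": { "type": "string" }
        },
        "typeProperties": {
          "schema": { "value": "@dataset().schema", "type": "Expression" },
          "table": { "value": "@dataset().table", "type": "Expression" }
        }
      }
    }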
I added the mapping dynamically using
@json(activity('Lookup1').output.firstRow.jsonMapping)
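In the pipeline JSON, this expression resolves to the Copy activity's translator property; a sketch of that fragment:

    "translator": {
      "value": "@json(activity('Lookup1').output.firstRow.jsonMapping)",
      "type": "Expression"
    }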
I executed the pipeline, and it ran successfully.
Here is the JSON of my pipeline:
If there are multiple files, perform the above activities inside a ForEach loop.
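As a sketch: a separate Get Metadata activity on the folder, with the Child items field selected, could feed the ForEach with

    @activity('Get Metadata2').output.childItems

and the inner activities would reference @item().name for each file name (the activity name Get Metadata2 is an assumption).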