skip to Main Content

I previously created a copy activity to copy all columns in sql table into ADLS in parquet format.
I am new to Azure data factory. So my question is how would copy only a specific set of columns into Azure data lake storage as Parquet?

3

Answers


    • In copy activity source settings, click the Query radio button in Use query option.
    • In Query text box, Type the query which selects only the required column.

    enter image description here

    • In sink, use the Parquet dataset.
      enter image description here

    • When copy activity is executed, only selected columns from SQL table gets copied to data lake.

    Preview Data of Source

    • When table is clicked in source dataset settings, all columns are selected.

    enter image description here

    • When Query option is clicked in source settings, selected columns from the query are displayed.

    enter image description here

    Login or Signup to reply.
  1. You can even restrict via mapping table and delete unnecessary fields, so that only the selected fields will get copied to the destination file or table of your interest.

    Login or Signup to reply.
  2. If you have a different table on each loop iteration, I assume the list of tables you’re parqueting is stored in a control table of some kind?

    So add the fields required from each table also to the control table.

    tableName fieldList
    customer id, surname, address
    product id, brand, sku

    Lookup from this table to obtain the table and field names, be sure to untick firstrowonly.

    enter image description here

    Pass the output into your ForEach loop:

    enter image description here

    enter image description here

    Finally access the table name and fields in the copy block query:

    enter image description here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search