skip to Main Content

I have an SQL table called FileGroups. I will query for certain columns such as Id, Name and Version. In my Azure Data Factory pipeline, I want to map the column names and row values from the SQL query results to key-value pairs for a JSON string. I also need to include a couple pipeline parameters in the JSON string. I will then pass this JSON string as input for a stored procedure at the end of my pipeline.

The resulting JSON string will look like this:

{
  "id": "guid1",
  "name": "fileGroup1",
  "version": 1.0,
  "pipeline_param_1": "value1",
  "pipeline_param_2": "value2"
},
{
  "id": "guid2",
  "name": "fileGroup2",
  "version": 2.0,
  "pipeline_param_1": "value1",
  "pipeline_param_2": "value2"
}

How do I query the SQL table and construct this JSON string all within my ADF pipeline? What activities or data flow transformations do I need to achieve this?

2

Answers


  1. the easiest way to implement it is by using a "copy activity"

    Here is a quick demo that i created, i want to transform SQL data into Json, i copied SalesLT.Customer data from sql sample data

    1. created SQL database with sample data in azure portal.
    2. In azure data factory, i added the database as a dataset.
    3. created a pipeline and i named it "mapSQLDataToJSON"
    4. in the pipeline , i added a "Copy activity"
    5. in copy activity, i added the sql db as a source dataset and added a query option , Query : "@concat(‘select CustomerID,Title, pipeId= ”’, pipeline().RunId,”’ from SalesLT.Customer’)"

    here you can select the columns that you need and add to the data a new column like i did , added a new column "pipId" and used pipeline params.

    1. in copy activity i added the blob storage as a sink and data type to be "Json"
    2. tested connections and triggered the pipeline
    3. i opened the blob storage , and i clicked on the copied Json data , and it worked.

    Copy activity in ADF:
    enter image description here

    Data in blob storage:
    enter image description here

    you can read here about copy activity and pipeline params , links:

    https://learn.microsoft.com/en-us/azure/data-factory/control-flow-system-variables

    https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-overview

    Login or Signup to reply.
  2. If your source database is a Microsoft SQL database, like Azure SQL DB, Sql Server, Managed Instance, Azure Synapse Analytics etc, then it is quite capable manipulating JSON. The FOR JSON clause constructs valid JSON and you can use options like WITHOUT_ARRAY_WRAPPER to produce clean output.

    A simple example:

    DROP TABLE IF EXISTS #tmp;
    
    CREATE TABLE #tmp (
        id                  VARCHAR(10) NOT NULL,
        [name]              VARCHAR(20) NOT NULL,
        [version]           VARCHAR(5) NOT NULL,
        pipeline_param_1    VARCHAR(20) NOT NULL,
        pipeline_param_2    VARCHAR(20) NOT NULL
    );
    
    INSERT INTO #tmp VALUES
        ( 'guid1', 'fileGroup1', '1.0', 'value1.1', 'value1.2' ),
        ( 'guid2', 'fileGroup2', '2.0', 'value2.1', 'value2.2' )
    
    SELECT *
    FROM #tmp
    FOR JSON PATH, WITHOUT_ARRAY_WRAPPER;
    

    Sample output:

    json output

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search