
Sorry for the long post; I need to explain this properly for people to understand.

I have a pipeline in Azure Data Factory (ADF) that triggers a published AML endpoint.

I am trying to parametrize this ADF pipeline so that I can deploy it to test and prod, but the AML endpoints are different on test and prod.

Therefore, I have tried to edit the parameter configuration in ADF.

Here in the section Microsoft.DataFactory/factories/pipelines I add "*":"=" so that all the pipeline parameters are parametrized:

"Microsoft.DataFactory/factories/pipelines": {
    "*": "="
}

After this I export the template to see which parameters are in the JSON. There are a lot of them, but I do not see any parameter that has the AML endpoint name as its value; I only see that the endpoint ID is parametrized.

My question is: is it possible to parametrize the AML endpoint by name, so that when deploying ADF to test I can just provide the AML endpoint name and it picks up the ID automatically?

Answers


  1. I faced a similar issue when deploying ADF pipelines with ML activities between environments. Unfortunately, as of now, the ADF parameter file does not expose the ML pipeline name as a parameter value. The only workaround is to modify the parameter (JSON) file so that it aligns with your pipeline design. For example, I trigger the ML pipeline endpoint inside a ForEach activity -> If Condition -> ML pipeline.

    Here are my parameter file values:

    "Microsoft.DataFactory/factories/pipelines": {
        "properties": {
            "activities": [
                {
                    "typeProperties": {
                        "mlPipelineEndpointId": "=",
                        "url": {
                            "value": "="
                        },
                        "ifFalseActivities": [
                            {
                                "typeProperties": {
                                    "mlPipelineEndpointId": "="
                                }
                            }
                        ],
                        "ifTrueActivities": [
                            {
                                "typeProperties": {
                                    "mlPipelineEndpointId": "="
                                }
                            }
                        ],
                        "activities": [
                            {
                                "typeProperties": {
                                    "mlPipelineEndpointId": "=",
                                    "ifFalseActivities": [
                                        {
                                            "typeProperties": {
                                                "mlPipelineEndpointId": "=",
                                                "url": "="
                                            }
                                        }
                                    ],
                                    "ifTrueActivities": [
                                        {
                                            "typeProperties": {
                                                "mlPipelineEndpointId": "=",
                                                "url": "="
                                            }
                                        }
                                    ]
                                }
                            }
                        ]
                    }
                }
            ]
        }
    }
    

    After you export the ARM template, the JSON file contains entries for your ML endpoints:

    "ADFPIPELINE_NAME_properties_1_typeProperties_1_typeProperties_0_typeProperties_mlPipelineEndpointId": {
        "value": "445xxxxx-xxxx-xxxxx-xxxxx"
    }
    

    This takes a lot of manual effort to maintain if the design changes frequently, but so far it has worked for me. Hope this answers your question.
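
As a rough illustration of how the exported values could be overridden per environment, here is a minimal Python sketch. It assumes the parameters have already been loaded into a dictionary and that every relevant key ends with `_mlPipelineEndpointId`, as in the exported name above. The function name `override_ml_endpoint_ids` and the placeholder values are hypothetical, not part of ADF.

```python
import json
from typing import Any, Dict


def override_ml_endpoint_ids(parameters: Dict[str, Any], endpoint_id: str) -> int:
    """Set every '*_mlPipelineEndpointId' parameter to endpoint_id.

    Returns the number of parameters that were updated.
    """
    updated = 0
    for name, entry in parameters.items():
        # Only touch entries whose name ends with the ML endpoint suffix.
        if name.endswith("_mlPipelineEndpointId") and isinstance(entry, dict):
            entry["value"] = endpoint_id
            updated += 1
    return updated


if __name__ == "__main__":
    # Hypothetical excerpt of the "parameters" object of an exported parameter file.
    params = {
        "factoryName": {"value": "my-adf-test"},
        "ADFPIPELINE_NAME_properties_1_typeProperties_1_typeProperties_0_typeProperties_mlPipelineEndpointId": {
            "value": "445xxxxx-xxxx-xxxxx-xxxxx"
        },
    }
    count = override_ml_endpoint_ids(params, "<test-endpoint-id>")
    print(count)
    print(json.dumps(params, indent=2))
```

Matching on the suffix rather than the full name means the script keeps working when the nesting level of the activity changes, although the parameter still has to be exposed by the parameter file in the first place.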

  2. I finally fixed this.

    The trick is to choose Pipeline ID instead of Pipeline Endpoint ID.

    Pipeline ID can be parametrized, and I have set it up to come from a global parameter. Therefore I do not need to find the right level of indentation every time.

    Later you add the global parameters to your ARM template.

    And in the parameter template you add:

    "Microsoft.DataFactory/factories": {
        "properties": {
            "globalParameters": {
                "*": {
                    "value": "="
                }
            },
            "globalConfigurations": {
                "*": "="
            },
            "encryption": {
                "*": "=",
                "identity": {
                    "*": "="
                }
            }
        }
    },
    "Microsoft.DataFactory/factories/globalparameters": {
        "properties": {
            "*": {
                "value": "="
            }
        }
    }
    

    Finally, I wrote a Python CLI tool to get the latest pipeline ID for a given published pipeline endpoint name:

    import argparse
    from azureml.pipeline.core import PipelineEndpoint
    from azureml.core import Workspace
    from env_variables import Env
    from manage_workspace import get_workspace
    
    
    def get_latest_published_endpoint(ws: Workspace, pipeline_name: str) -> str:
        """
        Get the ID of the latest published pipeline behind a pipeline endpoint.
        The function is used to update the pipeline ID in the ADF deploy pipeline.
    
        Parameters
        ----------
        ws : azureml.core.Workspace
            The workspace in which to look up the pipeline endpoint
        pipeline_name : str
            The name of the published pipeline endpoint
    
        Returns
        -------
        pipeline_id : str
            The ID of the pipeline currently served by the endpoint
        """
        pipeline_endpoint = PipelineEndpoint.get(workspace=ws, name=pipeline_name)
        # get_pipeline() without a version returns the endpoint's default pipeline;
        # its .id is the pipeline ID
        pipeline_id = pipeline_endpoint.get_pipeline().id
        # pipeline_endpoint.id is the pipeline endpoint ID, which cannot easily be
        # set as a dynamic parameter in ADF
    
        return pipeline_id
    
    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("--monitoring_pipeline_name", type=str,
                            help="Pipeline Name to get endpoint id",
                            default='yourmonitoringpipeline')
        parser.add_argument("--training_pipeline_name", type=str,
                            help="Pipeline Name to get endpoint id",
                            default='yourtrainingpipeline')
        parser.add_argument("--scoring_pipeline_name", type=str,
                            help="Pipeline Name to get endpoint id",
                            default='yourscoringpipeline')
        args, _ = parser.parse_known_args()
        e = Env()
    
        ws = get_workspace(e.workspace_name, e.subscription_id, e.resource_group)  # type: ignore
        latest_monitoring_endpoint = get_latest_published_endpoint(ws, pipeline_name=args.monitoring_pipeline_name)  # type: ignore
        latest_training_endpoint = get_latest_published_endpoint(ws, pipeline_name=args.training_pipeline_name) # type: ignore
        latest_scoring_endpoint = get_latest_published_endpoint(ws, pipeline_name=args.scoring_pipeline_name) # type: ignore
        print('##vso[task.setvariable variable=MONITORING_PIPELINE_ID;]%s' % (latest_monitoring_endpoint))
        print('##vso[task.setvariable variable=TRAINING_PIPELINE_ID;]%s' % (latest_training_endpoint))
        print('##vso[task.setvariable variable=SCORING_PIPELINE_ID;]%s' % (latest_scoring_endpoint))
    

    Printing the variables this way registers them as pipeline variables (also exposed as environment variables) that I can later pick up in the ARM deploy step.
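
The `print` calls in the script rely on the Azure DevOps `task.setvariable` logging command. A minimal sketch of that step in isolation is given below; the helper name `format_set_variable` and the placeholder IDs are hypothetical, and only the command format is taken from the script above.

```python
def format_set_variable(name: str, value: str) -> str:
    """Build an Azure DevOps 'task.setvariable' logging command line."""
    return "##vso[task.setvariable variable=%s;]%s" % (name, value)


if __name__ == "__main__":
    # Placeholder values; in the real script these come from the AML workspace.
    pipeline_ids = {
        "MONITORING_PIPELINE_ID": "<monitoring-pipeline-id>",
        "TRAINING_PIPELINE_ID": "<training-pipeline-id>",
        "SCORING_PIPELINE_ID": "<scoring-pipeline-id>",
    }
    for name, value in pipeline_ids.items():
        # The agent parses each line written to stdout and sets the variable
        # for the subsequent steps of the job.
        print(format_set_variable(name, value))
```

Later steps in the same job can then reference the values with the usual macro syntax, for example `$(SCORING_PIPELINE_ID)`.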

    And then we have our desired setup: different pipeline IDs for different environments.

    Maybe material for a blog post, as it works like a charm.

  3. Making changes to ADF (ARMTemplateForFactory.json) or Synapse (TemplateForWorkspace.json) inside a DevOps CI/CD pipeline

    Sometimes parameters are not automatically added to the parameter file (ARMTemplateParametersForFactory.json / TemplateParametersForWorkspace.json); MLPipelineEndpointId is one example.
    In the case of an ML pipeline you can use PipelineId as a parameter, but it can change every time the ML pipeline is updated.

    You can solve this issue by replacing the value in the ADF (ARMTemplateForFactory.json) or Synapse (TemplateForWorkspace.json) template using Azure PowerShell.
    The idea is simple: you use PowerShell to open the ARM template and replace the value based on the environment, and it works exactly like overwriting parameters within DevOps.

    This editing is done on the fly, i.e. the DevOps artifact is updated and not the repo file; the ADF/Synapse repository won't change, just as when overwriting parameters.

    Issue
    We currently have two environments for Synapse, called bla-bla-dev and bla-bla-test. The dev Synapse environment uses the dev machine learning environment, and the test Synapse environment uses the test ML environment.
    But the MLPipelineEndpointId is grayed out on dev Synapse and the parameter is not present in the parameter file, so it can't be overwritten normally.

    Solution
    Use Azure PowerShell to run the command below:

    (Get-Content $(System.DefaultWorkingDirectory)/Artifacts_source/bla-bla-dev/TemplateForWorkspace.json).Replace($(scoringMLPipelineEndPointDev), $(scoringMLPipelineEndPoint)) | Set-Content $(System.DefaultWorkingDirectory)/Artifacts_source/bla-bla-dev/TemplateForWorkspace.json
    
    • $(System.DefaultWorkingDirectory) = Points to the release pipeline's artifacts, which are based on the ARM template repository.
    • $(scoringMLPipelineEndPointDev) = The value you would like to replace.
    • $(scoringMLPipelineEndPoint) = The value that will replace the dev parameter value.

    Steps

    1. Create two DevOps pipeline variables: one for the dev environment (the value to be replaced) and one for the test environment (the test MLPipelineEndpointId for the test Synapse pipeline).

    2. Add an Azure PowerShell step to the ADF/Synapse release DevOps pipeline.
      This step has to be placed before the ARM template deployment step.

      (Get-Content $(System.DefaultWorkingDirectory)/Artifacts_source/bla-bla-dev/TemplateForWorkspace.json).Replace($(scoringMLPipelineEndPointDev), $(scoringMLPipelineEndPoint)) | Set-Content $(System.DefaultWorkingDirectory)/Artifacts_source/bla-bla-dev/TemplateForWorkspace.json

    Once deployed, you will see that your test environment points to the test MLPipelineEndpointId.
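
For readers who prefer a script step in Python, the same find-and-replace idea can be sketched as follows. This is not the PowerShell command from the answer, only an assumed equivalent; the function name `replace_endpoint_id` is hypothetical, and the choice to raise an error when the old ID is missing is an added safeguard, not something the original command does.

```python
from pathlib import Path


def replace_endpoint_id(template_path: str, old_id: str, new_id: str) -> int:
    """Replace every occurrence of old_id with new_id in the template file.

    Returns the number of occurrences replaced. Raises ValueError if old_id
    is not found, so that a silent no-op does not reach the deployment step.
    """
    path = Path(template_path)
    text = path.read_text(encoding="utf-8")
    occurrences = text.count(old_id)
    if occurrences == 0:
        raise ValueError(f"{old_id!r} not found in {template_path}")
    # Rewrite the artifact copy in place; the repository file is untouched.
    path.write_text(text.replace(old_id, new_id), encoding="utf-8")
    return occurrences
```

In a release pipeline this would be called with the path to the template inside the artifact directory and the dev and test endpoint IDs taken from pipeline variables.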
