
I am trying to create a custom visual transform in an AWS Glue visual ETL job in order to add a column to the output. I would like to parameterize the name of the column and pass it at runtime from a Lambda function. Is it possible to pass runtime parameters to a custom visual transform?

I created the transform itself with dummy values and it works fine, but I do not see any option to add runtime parameters to it.
Neither the documentation nor the examples show whether runtime parameters can be added to the config file.

TIA

2 Answers


  1. Chosen as BEST ANSWER

    Found a solution to this: the arguments can be read inside the custom transform itself with args = getResolvedOptions(sys.argv, ["arg1", "arg2"]). I could not find a way to pass them at the time of the call, though.


  2. Actually, the AWS docs make it pretty clear how to pass arguments to a job run using the SDK. You can also use the Console: when running the job there, there is an option to specify additional runtime parameters.

    Suppose that you created a JobRun in a script, perhaps within a Lambda function:

    import boto3

    # Glue client used to start the job run
    client = boto3.client('glue')

    response = client.start_job_run(
                 JobName = 'my_test_Job',
                 Arguments = {
                   '--day_partition_key':   'partition_0',
                   '--hour_partition_key':  'partition_1',
                   '--day_partition_value':  day_partition_value,
                   '--hour_partition_value': hour_partition_value } )
    

    To retrieve the arguments that are passed, you can use the getResolvedOptions function as follows:

    import sys
    from awsglue.utils import getResolvedOptions
    
    args = getResolvedOptions(sys.argv,
                              ['JOB_NAME',
                               'day_partition_key',
                               'hour_partition_key',
                               'day_partition_value',
                               'hour_partition_value'])
    # Python 3 print calls; the original used Python 2 print statements
    print("The day-partition key is: ", args['day_partition_key'])
    print("and the day-partition value is: ", args['day_partition_value'])
    

    Note that each argument is defined with two leading hyphens and then referenced in the script without them. The argument names use only underscores, not hyphens. Your arguments need to follow this convention to be resolved.
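
For the Lambda side of the original question, here is a minimal sketch of a handler that applies the naming convention above before calling start_job_run. The helper build_job_arguments, the event keys job_name and parameters, and the optional client parameter are hypothetical names chosen for illustration; only start_job_run with JobName and Arguments comes from the answer above.

```python
def build_job_arguments(params):
    """Prefix each parameter name with '--' as Glue job arguments expect.

    Hypothetical helper: rejects names containing hyphens, since the
    answer above notes that argument names use only underscores.
    """
    arguments = {}
    for name, value in params.items():
        if "-" in name:
            raise ValueError(
                f"argument name {name!r} must use underscores, not hyphens"
            )
        # Job argument values are passed as strings
        arguments[f"--{name}"] = str(value)
    return arguments


def lambda_handler(event, context, client=None):
    """Start a Glue job run with parameters taken from the Lambda event.

    Assumes an event shaped like:
      {"job_name": "my_test_Job", "parameters": {"column_name": "load_date"}}
    """
    if client is None:
        # Imported lazily so the helper above can be exercised without boto3
        import boto3
        client = boto3.client("glue")

    response = client.start_job_run(
        JobName=event["job_name"],
        Arguments=build_job_arguments(event["parameters"]),
    )
    return {"JobRunId": response["JobRunId"]}
```

With an event like the one in the docstring, the custom transform could then read the value with getResolvedOptions(sys.argv, ["column_name"]), as described in the first answer.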
