
I’m using SageMaker Pipelines to chain together two consecutive ProcessingJobs. I’m getting a weird error when I call pipeline.upsert():

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreatePipeline operation: Unable to parse pipeline definition. Property 'null' with value 'null' is not of expected type 'String'

This is what my pipeline looks like:

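    import os

    from sagemaker.workflow.pipeline import Pipeline
    from sagemaker.workflow.steps import ProcessingStep

    step_process_data = ProcessingStep(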
        name='ProcessDataStep',
        processor=script_processor,
        code=os.path.join(BASE_DIR, "scripts/preprocess.py"),
        job_arguments=job_arguments
    )
    
    step_split_data = ProcessingStep(
        name='SplitDataStep',
        processor=script_processor,
        code=os.path.join(BASE_DIR, "scripts/split_data.py"),
        job_arguments=job_arguments,
        depends_on=[step_process_data]
    )
    
    pipeline = Pipeline(
        name="DataPreperationPipeline",
        steps=[step_process_data, step_split_data],
        sagemaker_session=sagemaker_session
    )

Any thoughts on what I am doing wrong or missing?

2 Answers


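  1. I ran into the same issue when my job_arguments were not all strings. The error says it expected type 'String', so I’d make sure every item in job_arguments is a string, for example by wrapping numbers in str() before passing the list to ProcessingStep.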
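
A minimal sketch of that check, assuming job_arguments is an ordinary Python list built before the steps are created. The helper name stringify_job_arguments and the sample flags are made up for illustration and are not part of the SageMaker SDK. Treating None as an error is also an assumption: the 'null' in the message suggests a missing value may be involved, but the question does not confirm that.

```python
def stringify_job_arguments(args):
    """Return a copy of args with every item converted to str.

    Hypothetical helper, not part of the SageMaker SDK.
    None is rejected instead of being turned into the string 'None',
    since a missing value is more likely a bug than an intended argument.
    """
    cleaned = []
    for index, value in enumerate(args):
        if value is None:
            raise ValueError(
                f"job_arguments[{index}] is None; supply a value or drop the flag"
            )
        cleaned.append(str(value))
    return cleaned


# Hypothetical arguments mixing strings and numbers.
raw_arguments = ["--test-size", 0.2, "--seed", 42]
job_arguments = stringify_job_arguments(raw_arguments)
print(job_arguments)  # ['--test-size', '0.2', '--seed', '42']
```

The cleaned list can then be passed as job_arguments to both ProcessingStep calls in the question. Raising on None surfaces a missing value locally, before pipeline.upsert() sends the definition to the service.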
