I’m using SageMaker Pipelines to chain together two consecutive ProcessingJobs, and I’m getting a weird error when I call pipeline.upsert():
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreatePipeline operation: Unable to parse pipeline definition. Property 'null' with value 'null' is not of expected type 'String'
This is what my pipeline looks like:
step_process_data = ProcessingStep(
    name='ProcessDataStep',
    processor=script_processor,
    code=os.path.join(BASE_DIR, "scripts/preprocess.py"),
    job_arguments=job_arguments
)

step_split_data = ProcessingStep(
    name='SplitDataStep',
    processor=script_processor,
    code=os.path.join(BASE_DIR, "scripts/split_data.py"),
    job_arguments=job_arguments,
    depends_on=[step_process_data]
)

pipeline = Pipeline(
    name="DataPreperationPipeline",
    steps=[step_process_data, step_split_data],
    sagemaker_session=sagemaker_session
)
Any thoughts on what I am doing wrong or missing?
2 Answers
I'm not sure whether all of your objects are set up correctly. Could you follow the example below and verify?
https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-pipelines/tabular/abalone_build_train_deploy/sagemaker-pipelines-preprocess-train-evaluate-batch-transform.ipynb
I ran into the same issue: my job_arguments were not all strings. I'd make sure every item in job_arguments is a string.
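For example (a minimal sketch; the argument names and values below are hypothetical), casting every item to a string before constructing the steps avoids the serialization problem:

# Hypothetical job_arguments: the numeric items are what make the
# pipeline definition fail to serialize as strings.
job_arguments = ["--train-split", 0.8, "--seed", 42]

# Cast every argument to a string before passing it to the ProcessingStep.
job_arguments = [str(arg) for arg in job_arguments]
# -> ["--train-split", "0.8", "--seed", "42"]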