I’m migrating our ML notebooks from Azure Databricks to AWS, using SageMaker and Step Functions. I have separate notebooks for data processing, feature engineering, and ML algorithms, and I want to run them in sequence, each one starting after the previous notebook completes. Can you point me to any resource that shows how to execute SageMaker notebooks in sequence using AWS Step Functions?
Question posted in Amazon Web Services
The official Amazon Web Services documentation can be found here.
2 Answers
For this type of architecture, you need to involve some other AWS services as well.
One option is a combination of EventBridge (scheduled rules) that invokes a Lambda function, which in turn calls SageMaker, where you can execute your notebooks.
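As a rough sketch of the Lambda part of that chain, the handler below starts a SageMaker processing job when an EventBridge scheduled rule invokes it. It assumes the notebook logic has been packaged into a container image; the environment variable names `IMAGE_URI` and `SAGEMAKER_ROLE_ARN`, the job prefix, and the instance settings are hypothetical placeholders, not values from the question. Note that this only starts a job on a schedule; it does not wait for the job to finish, so on its own it does not enforce an order between notebooks.

```python
import os
import time


def start_processing_job(sagemaker_client, image_uri, role_arn, job_prefix="notebook-step"):
    """Start one SageMaker processing job and return its generated name.

    The client is passed in so the function can be exercised without AWS access.
    """
    # Job names must be unique per account and region, so append a timestamp.
    job_name = f"{job_prefix}-{int(time.time())}"
    sagemaker_client.create_processing_job(
        ProcessingJobName=job_name,
        RoleArn=role_arn,
        AppSpecification={"ImageUri": image_uri},
        ProcessingResources={
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",  # placeholder instance type
                "VolumeSizeInGB": 30,            # placeholder volume size
            }
        },
    )
    return job_name


def lambda_handler(event, context):
    """Entry point invoked by the EventBridge scheduled rule."""
    import boto3  # imported here so the helper above has no AWS dependency

    client = boto3.client("sagemaker")
    job_name = start_processing_job(
        client,
        image_uri=os.environ["IMAGE_URI"],          # hypothetical variable name
        role_arn=os.environ["SAGEMAKER_ROLE_ARN"],  # hypothetical variable name
    )
    return {"processingJobName": job_name}
```

The Lambda execution role would need permission to create processing jobs and to pass the SageMaker role for this to work.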
A new feature allows you to operationalize your Amazon SageMaker Studio notebooks as scheduled notebook jobs. Unfortunately, there is no way yet to tie them together into a pipeline.
The other alternative would be to convert your notebooks to processing and training jobs, and use something like AWS Step Functions or SageMaker Pipelines to run them as a pipeline.
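To make the Step Functions option more concrete, here is a minimal sketch that builds an Amazon States Language definition chaining two processing jobs and one training job. The `.sync` service integrations make each state wait for its SageMaker job to finish before the next one starts, which gives the sequential behaviour asked about. The state names, image URIs, role ARN, bucket path, and instance settings are all hypothetical placeholders, and the helper functions are illustrative rather than part of any AWS SDK.

```python
import json

# Step Functions service integrations that wait for the job to complete.
PROCESSING_SYNC = "arn:aws:states:::sagemaker:createProcessingJob.sync"
TRAINING_SYNC = "arn:aws:states:::sagemaker:createTrainingJob.sync"

ROLE_ARN = "arn:aws:iam::123456789012:role/ExampleSageMakerRole"  # placeholder
CLUSTER = {"InstanceCount": 1, "InstanceType": "ml.m5.xlarge", "VolumeSizeInGB": 30}


def unique_name(prefix):
    """ASL expression that appends the execution name to keep job names unique."""
    return f"States.Format('{prefix}-{{}}', $$.Execution.Name)"


def processing_parameters(prefix, image_uri):
    return {
        "ProcessingJobName.$": unique_name(prefix),
        "RoleArn": ROLE_ARN,
        "AppSpecification": {"ImageUri": image_uri},
        "ProcessingResources": {"ClusterConfig": CLUSTER},
    }


def training_parameters(prefix, image_uri, output_path):
    return {
        "TrainingJobName.$": unique_name(prefix),
        "RoleArn": ROLE_ARN,
        "AlgorithmSpecification": {"TrainingImage": image_uri, "TrainingInputMode": "File"},
        "OutputDataConfig": {"S3OutputPath": output_path},
        "ResourceConfig": CLUSTER,
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def build_sequential_definition(steps):
    """Chain (state_name, resource_arn, parameters) tuples into a linear workflow."""
    if not steps:
        raise ValueError("at least one step is required")
    states = {}
    for index, (name, resource, parameters) in enumerate(steps):
        state = {"Type": "Task", "Resource": resource, "Parameters": parameters}
        if index + 1 < len(steps):
            state["Next"] = steps[index + 1][0]  # run the following step afterwards
        else:
            state["End"] = True  # last step ends the execution
        states[name] = state
    return {"StartAt": steps[0][0], "States": states}


definition = build_sequential_definition([
    ("DataProcessing", PROCESSING_SYNC,
     processing_parameters("data-processing", "<processing-image-uri>")),
    ("FeatureEngineering", PROCESSING_SYNC,
     processing_parameters("feature-engineering", "<feature-image-uri>")),
    ("ModelTraining", TRAINING_SYNC,
     training_parameters("model-training", "<training-image-uri>", "s3://<bucket>/output")),
])

print(json.dumps(definition, indent=2))
```

The printed JSON could then be supplied as the `definition` argument of boto3's Step Functions `create_state_machine` call, or pasted into the Step Functions console, once the placeholders are replaced with real values.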