I am trying to use the HuggingFace estimator to run training on Amazon SageMaker, e.g.
from sagemaker.huggingface import HuggingFace

# create the Estimator
huggingface_estimator = HuggingFace(
    entry_point='train.py',
    source_dir='./scripts',
    instance_type='ml.p3.2xlarge',
    instance_count=1,
    role=role,
    transformers_version='4.17',
    pytorch_version='1.10',
    py_version='py38',
    hyperparameters=hyperparameters
)
When I try to increase the version to transformers_version='4.24', it throws an error saying that the maximum supported version is 4.17.
How can I use AWS SageMaker with a newer version of the HuggingFace estimator?
There's a note on using a newer version for inference at https://discuss.huggingface.co/t/deploying-open-ais-whisper-on-sagemaker/24761/9, but the way to use it for training with the HuggingFace estimator looks kind of complicated (https://discuss.huggingface.co/t/huggingface-pytorch-versions-on-sagemaker/26315/5?u=alvations), and it's not confirmed that those steps actually work.
3 Answers
You can use the PyTorch estimator and place a requirements.txt with transformers added to it in your source directory. SageMaker installs the packages listed there into the training container before your script runs, so you are not limited to the Transformers versions bundled with the HuggingFace estimator.
To achieve this, structure your source directory like this:

scripts/
    train.py
    requirements.txt

and pass the source_dir attribute to the PyTorch estimator, as in the sketch below.
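A minimal sketch of what that could look like (the framework_version, py_version, and other values here are assumptions; pick the PyTorch container that matches the Transformers release you pin in requirements.txt):

# sketch: swap the HuggingFace estimator for the generic PyTorch estimator
from sagemaker.pytorch import PyTorch

pytorch_estimator = PyTorch(
    entry_point='train.py',
    source_dir='./scripts',        # contains train.py and requirements.txt
    instance_type='ml.p3.2xlarge',
    instance_count=1,
    role=role,
    framework_version='1.12',      # assumed PyTorch container version
    py_version='py38',
    hyperparameters=hyperparameters
)

# packages listed in scripts/requirements.txt are installed in the
# training container before train.py runs
pytorch_estimator.fit()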
@alvas,
Amazon SageMaker is a managed service, which means AWS builds and operates the tooling for you, saving you time. In your case, the tooling of interest is the integration of a new version of the HuggingFace Transformers library with SageMaker, which has to be developed, tested, and deployed to production. So this integration is naturally expected to be one or a few versions behind the upstream library. As a benefit, you always get a version of Transformers that is proven to be stable and compatible with SageMaker.
In your case, you want to try the latest version of Transformers in SageMaker, potentially sacrificing that stability and compatibility (v4.24 was released less than a month ago). As you correctly mentioned, this workflow can be "kind of complicated" and it is "not confirmed that the complicated steps can work". @Arun Lokanatha suggested the easiest way to try the new version. Indeed, Transformers works with the regular PyTorch estimator, but instead of the high-level HuggingFace estimator API you now need to use the lower-level PyTorch estimator API, and the above-mentioned requirements.txt will look like the short sketch after this paragraph. As a drawback, you have to do a bit more work yourself, e.g. figuring out the minimal versions of the PyTorch/CUDA libraries required. And you are responsible for testing, securing, and optimizing the integration as appropriate for production-grade use, potentially losing some of the benefits of using SageMaker at its full capability.
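For example, assuming you want to try Transformers v4.24 as in the question, the requirements.txt could simply pin that release (add any other extra packages your train.py needs):

# scripts/requirements.txt -- pin the release you want to try
transformers==4.24.0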
If, after my explanation, you finally decide to use the HuggingFace high-level estimator in production, I recommend taking at least these actions:
I hope this answer is helpful.
Ivan
You can achieve this by:
Step-1: Create a custom ECR image with the required HF version (https://docs.aws.amazon.com/sagemaker/latest/dg/studio-byoi.html)
Step-2: Develop your train.py
Step-3: Pass train.py and the new ECR image URI to sagemaker.estimator (https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html), as in the sketch below.
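A rough sketch of Step-3, assuming the custom image has already been pushed to ECR. The image URI below is a placeholder, and passing entry_point/source_dir to the generic Estimator assumes a reasonably recent SageMaker Python SDK and an image built with the SageMaker training toolkit; otherwise, bake train.py into the image itself.

# sketch: generic estimator with a custom training image
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri='<account-id>.dkr.ecr.<region>.amazonaws.com/<repo>:latest',  # placeholder URI
    role=role,
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    entry_point='train.py',      # script-mode arguments, if your SDK/image support them
    source_dir='./scripts',
    hyperparameters=hyperparameters
)
estimator.fit()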