  • I have fine-tuned a Gemma 7B LLM from Hugging Face using LoRA and stored the model as a compressed .tar.gz file.
  • I fine-tuned it locally in SageMaker.
  • This is the .tar.gz file structure of the fine-tuned model:

finetuned_gemma/config.json
finetuned_gemma/generation_config.json
finetuned_gemma/model.safetensors.index.json
finetuned_gemma/model-00001-of-00004.safetensors
finetuned_gemma/model-00002-of-00004.safetensors
finetuned_gemma/model-00003-of-00004.safetensors
finetuned_gemma/model-00004-of-00004.safetensors
finetuned_gemma/special_tokens_map.json
finetuned_gemma/tokenizer.json
finetuned_gemma/tokenizer_config.json
finetuned_gemma/code/
finetuned_gemma/code/inference.py
finetuned_gemma/code/requirements.txt
finetuned_gemma/code/.ipynb_checkpoints/
finetuned_gemma/code/.ipynb_checkpoints/requirements-checkpoint.txt


The fine-tuned model is also stored in AWS S3.

How do I now deploy the model as a SageMaker endpoint?

By the way, I used transformers version 4.38.0, as that is the minimum version required by the Gemma tokenizer.

I also want to know which image URI to use when deploying. Please help.
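For the image URI part, a minimal sketch of how one can look up a prebuilt Hugging Face Deep Learning Container with the SageMaker SDK. The region, version strings, and instance type here are assumptions and must match a DLC release that actually exists:

```python
from sagemaker import image_uris

# Assumed values: region, transformers version, and base framework version
# must correspond to a published Hugging Face inference DLC.
image_uri = image_uris.retrieve(
    framework="huggingface",
    region="us-east-1",
    version="4.37.0",                    # transformers version baked into the DLC
    image_scope="inference",
    base_framework_version="pytorch2.1.0",
    instance_type="ml.g5.2xlarge",
)
print(image_uri)
```

The resulting URI can be passed as `image_uri=` to `HuggingFaceModel` instead of the individual version arguments.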

I tried using sagemaker.huggingface.HuggingFaceModel and then deploying it, but I'm facing lots of difficulties.
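For reference, a minimal sketch of the approach I tried. The S3 path is a hypothetical placeholder, the role assumes the code runs inside SageMaker, and the DLC version strings are assumptions that must resolve to an existing Hugging Face inference container:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes a SageMaker execution context

# Hypothetical S3 path to the .tar.gz described above
model_data = "s3://my-bucket/finetuned_gemma.tar.gz"

huggingface_model = HuggingFaceModel(
    model_data=model_data,
    role=role,
    # Assumed versions: prebuilt DLCs may not yet ship transformers 4.38.0,
    # so code/requirements.txt inside the tarball can pin
    # transformers==4.38.0 to upgrade it at model-load time.
    transformers_version="4.37",
    pytorch_version="2.1",
    py_version="py310",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # GPU instance sized for a 7B model
)
```

With a `code/inference.py` and `code/requirements.txt` laid out as in the tarball above, the container should pick up the custom inference script automatically.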
