  • I have fine-tuned a Gemma 7B LLM from Hugging Face using LoRA and stored the model as a compressed .tar.gz file.
  • I fine-tuned it locally in SageMaker.
  • This is the .tar.gz file structure of the fine-tuned model:

finetuned_gemma/config.json
finetuned_gemma/generation_config.json
finetuned_gemma/model.safetensors.index.json
finetuned_gemma/model-00001-of-00004.safetensors
finetuned_gemma/model-00002-of-00004.safetensors
finetuned_gemma/model-00003-of-00004.safetensors
finetuned_gemma/model-00004-of-00004.safetensors
finetuned_gemma/special_tokens_map.json
finetuned_gemma/tokenizer.json
finetuned_gemma/tokenizer_config.json
finetuned_gemma/code/
finetuned_gemma/code/inference.py
finetuned_gemma/code/requirements.txt
finetuned_gemma/code/.ipynb_checkpoints/
finetuned_gemma/code/.ipynb_checkpoints/requirements-checkpoint.txt


The fine-tuned model is also stored in AWS S3.

How do I now deploy the model as a SageMaker endpoint?

By the way, I used transformers version 4.38.0, as that is the minimum version required by the Gemma tokenizer.

I also want to know which image URI to use when deploying. Please help.
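For the image URI part, a minimal sketch of how one can look up a prebuilt Hugging Face Deep Learning Container with the SageMaker SDK. The region, version strings, and instance type here are assumptions and must match a DLC release that actually exists:

```python
from sagemaker import image_uris

# Assumed values: region, transformers version, and base framework version
# must correspond to a published Hugging Face inference DLC.
image_uri = image_uris.retrieve(
    framework="huggingface",
    region="us-east-1",
    version="4.37.0",                    # transformers version baked into the DLC
    image_scope="inference",
    base_framework_version="pytorch2.1.0",
    instance_type="ml.g5.2xlarge",
)
print(image_uri)
```

The resulting URI can be passed as `image_uri=` to `HuggingFaceModel` instead of the individual version arguments.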

I tried using sagemaker.huggingface.HuggingFaceModel and then deploying it, but I'm facing lots of difficulties.
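For reference, a minimal sketch of the approach I tried. The S3 path is a hypothetical placeholder, the role assumes the code runs inside SageMaker, and the DLC version strings are assumptions that must resolve to an existing Hugging Face inference container:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes a SageMaker execution context

# Hypothetical S3 path to the .tar.gz described above
model_data = "s3://my-bucket/finetuned_gemma.tar.gz"

huggingface_model = HuggingFaceModel(
    model_data=model_data,
    role=role,
    # Assumed versions: prebuilt DLCs may not yet ship transformers 4.38.0,
    # so code/requirements.txt inside the tarball can pin
    # transformers==4.38.0 to upgrade it at model-load time.
    transformers_version="4.37",
    pytorch_version="2.1",
    py_version="py310",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # GPU instance sized for a 7B model
)
```

With a `code/inference.py` and `code/requirements.txt` laid out as in the tarball above, the container should pick up the custom inference script automatically.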
