
I followed this tutorial (a Colab notebook) to fine-tune my model.

Trying to load my locally saved model

model = AutoModelForCausalLM.from_pretrained("finetuned_model")

yields Killed.
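
Killed here is almost certainly the Linux out-of-memory killer: the process ran out of RAM while materializing the full-precision weights. A lower-memory load along the following lines may avoid it (the half-precision dtype and the low_cpu_mem_usage flag are assumptions, not part of the original code):

import torch
from transformers import AutoModelForCausalLM

# Stream weights into the model shard by shard instead of building a full
# fp32 copy in RAM first, and keep them in half precision.
model = AutoModelForCausalLM.from_pretrained(
    "finetuned_model",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)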


Trying to load the model from the hub:

import os
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

cwd = os.getcwd()  # the tokenizer files live in the working directory

peft_model_id = "lucas0/empath-llama-7b"
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=True, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(cwd+"/tokenizer.model")

# Load the LoRA model
model = PeftModel.from_pretrained(model, peft_model_id)

yields

AttributeError: /home/ubuntu/empath/lora/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats
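
That undefined symbol in libbitsandbytes_cpu.so means bitsandbytes fell back to its CPU-only binary, which does not ship the 8-bit kernels; load_in_8bit=True needs the CUDA build of bitsandbytes and a visible GPU. A guarded load along these lines (a sketch, not the original code) sidesteps the crash:

import torch
from transformers import AutoModelForCausalLM

# config comes from PeftConfig.from_pretrained(peft_model_id) above
load_kwargs = {"return_dict": True, "device_map": "auto"}
if torch.cuda.is_available():
    # 8-bit quantization requires the CUDA build of bitsandbytes
    load_kwargs["load_in_8bit"] = True
else:
    # no usable GPU: skip quantization instead of crashing in the CPU stub
    load_kwargs["torch_dtype"] = torch.float32

model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, **load_kwargs)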

full stacktrace

Model Creation:

I fine-tuned the model using PEFT and LoRA:

import torch
import transformers
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    torch_dtype=torch.float16,
    device_map='auto',
)

I had to download and manually specify the LLaMA tokenizer.

from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer(cwd+"/tokenizer.model")
tokenizer.pad_token = tokenizer.eos_token

Then, for the training:

from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, config)
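
After wrapping with get_peft_model, PEFT can report how few parameters are actually trainable, which is a quick sanity check that the LoRA adapters attached to the intended q/k/v/o projections:

model.print_trainable_parameters()
# prints something like: trainable params: ~8.4M || all params: ~6.7B || trainable%: ~0.12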

import pandas as pd
from datasets import Dataset

data = pd.read_csv("my_csv.csv")
dataset = Dataset.from_pandas(data)
tokenized_dataset = dataset.map(lambda samples: tokenizer(samples["text"]))
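
One common refinement (an assumption here, not part of the original script) is to tokenize in batches with truncation, so a stray long row cannot blow past the context window or GPU memory, and to drop the raw text column the model cannot consume:

tokenized_dataset = dataset.map(
    lambda samples: tokenizer(samples["text"], truncation=True, max_length=512),  # 512 is an assumed cap
    batched=True,
    remove_columns=dataset.column_names,
)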

trainer = transformers.Trainer(
    model=model,
    train_dataset=tokenized_dataset,
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        warmup_steps=100,
        max_steps=100,
        learning_rate=1e-3,
        fp16=True,
        logging_steps=1,
        output_dir='outputs',
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False)
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()

and saved it locally with:

trainer.save_model(cwd+"/finetuned_model")
print("saved trainer locally")

as well as to the hub:

model.push_to_hub("lucas0/empath-llama-7b", create_pr=1)
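
Note that for a PeftModel, both trainer.save_model and push_to_hub write only the small LoRA adapter (adapter_config.json plus the adapter weights), not a standalone checkpoint, which is why AutoModelForCausalLM.from_pretrained cannot load the result on its own. Loading means re-attaching the adapter to the base model, roughly:

from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

config = PeftConfig.from_pretrained("lucas0/empath-llama-7b")
base = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)  # full base weights
model = PeftModel.from_pretrained(base, "lucas0/empath-llama-7b")            # overlay the LoRA adapter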

How can I load my fine-tuned model?

2 Answers


  1. To load a fine-tuned PEFT/LoRA model, take a look at the Guanaco example: https://stackoverflow.com/a/76372390/610569

    import torch
    from peft import PeftModel    
    from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaTokenizer, StoppingCriteria, StoppingCriteriaList, TextIteratorStreamer
    
    model_name = "decapoda-research/llama-7b-hf"
    adapters_name = "lucas0/empath-llama-7b"
    
    print(f"Starting to load the model {model_name} into memory")
    
    m = AutoModelForCausalLM.from_pretrained(
        model_name,
        #load_in_4bit=True,
        torch_dtype=torch.bfloat16,
        device_map={"": 0}
    )
    m = PeftModel.from_pretrained(m, adapters_name)
    m = m.merge_and_unload()
    tok = LlamaTokenizer.from_pretrained(model_name)
    tok.bos_token_id = 1
    
    stop_token_ids = [0]
    
    print(f"Successfully loaded the model {model_name} into memory")
    

    You will need at least an A10G GPU runtime to load the model properly.
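
    After merge_and_unload the LoRA weights are folded into the base model, so you can also persist the merged result as a plain checkpoint and reload it later without peft at all. A sketch (the output directory name is an assumption):

    m.save_pretrained("empath-llama-7b-merged")    # writes full, merged weights
    tok.save_pretrained("empath-llama-7b-merged")
    # later: AutoModelForCausalLM.from_pretrained("empath-llama-7b-merged")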


    For more details, see the linked Stack Overflow answer above.

  2. You can load the model like this after pushing it to the hub; I used the following snippet successfully.

    # pip install peft transformers
    import torch
    from peft import PeftModel, PeftConfig
    from transformers import LlamaTokenizer, LlamaForCausalLM
    from accelerate import infer_auto_device_map, init_empty_weights
    
    peft_model_id = "--path--"
    
    config = PeftConfig.from_pretrained(peft_model_id)
    
    model1 = LlamaForCausalLM.from_pretrained(
        config.base_model_name_or_path,
        torch_dtype='auto',
        device_map='auto',
        offload_folder="offload", offload_state_dict = True
    )
    tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)
    
    # Load the Lora model
    model1 = PeftModel.from_pretrained(model1, peft_model_id)
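
    Once both pieces are loaded, inference is the usual generate call. A quick smoke test (the prompt is an assumed example):

    prompt = "Hello, how are you feeling today?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model1.device)
    output_ids = model1.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))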
    