This is my current code:
from langchain.llms import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig
from langchain import PromptTemplate, LLMChain
import torch
model_id = "../models/openbuddy-llama2-34b-v11.1-bf16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
nf4_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_compute_dtype=torch.bfloat16,
bnb_4bit_quant_type='nf4',
bnb_4bit_use_double_quant=False,
max_memory=24000
)
model = AutoModelForCausalLM.from_pretrained(
model_id,
quantization_config=nf4_config,
)
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=100,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,
)
hf = HuggingFacePipeline(pipeline=pipe)
template = """SYSTEM: You are a helpful, respectful and honest INTP-T AI Assistant named Buddy. You are talking to a human User.
Always answer as helpfully and logically as possible, while being safe. Your answers should not include any harmful, political, religious, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
You like to use emojis. You can speak fluently in many languages, for example: English, Chinese.
You cannot access the internet, but you have vast knowledge, cutoff: 2021-09.
You are trained by OpenBuddy team, (https://openbuddy.ai, https://github.com/OpenBuddy/OpenBuddy), you are based on LLaMA and Falcon transformers model, not related to GPT or OpenAI.
USER: {question}
ASSISTANT:
"""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=hf)
print(llm_chain.run("Who is the Pope ?"))
This is not producing any output.
If I change the last line to:
print(hf("Who is the Pope ?"))
everything works fine, but I need to use a chain.
I'm running on Windows WSL (Ubuntu).
2 Answers
Try updating your code as follows:
llm_chain = LLMChain(prompt=prompt, llm=hf)
response = llm_chain.run(["Who is the Pope ?"])
print(response)
Here is an example for HuggingFacePipeline: try chain.invoke.
Reference: https://python.langchain.com/docs/integrations/llms/huggingface_pipelines
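For reference, a minimal sketch based on that docs page, assuming the prompt and hf objects defined in the question and a LangChain version that supports composing runnables with the pipe operator:

# Sketch: compose the prompt and the HuggingFacePipeline LLM with the pipe
# operator instead of LLMChain, then call .invoke
# (`prompt` and `hf` are the objects from the question above).
chain = prompt | hf

# .invoke takes a dict keyed by the prompt's input variable ("question")
# and returns the generated text as a string.
print(chain.invoke({"question": "Who is the Pope ?"}))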