Recommended prompt form with apply_chat_template
I don't understand the difference, or what the best way to build a prompt is. Is there any difference between using `apply_chat_template` and then calling `client.text_generation`, versus simply passing the messages directly to `client.chat.completions.create`?
```python
from huggingface_hub import InferenceClient
from transformers import AutoTokenizer

model = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model)

chat = [
    {"role": "user", "content": "Hello, what is your name?"},
    {
        "role": "assistant",
        "content": "Hello, I am an AI model, how can I help you?",
    },
    {"role": "user", "content": "I am learning Python, can you give me a tip!"},
]

# the keyword is `tokenize`, not `tokenizer`
template = tokenizer.apply_chat_template(
    chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)
```
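For context, here is a rough sketch of the kind of prompt string `apply_chat_template` renders for a Mistral-style instruct model. The real template lives in the tokenizer config, so this hand-rolled version is only an approximation of the `[INST] ... [/INST]` format, not the exact output:

```python
def render_mistral_prompt(messages):
    """Approximate a Mistral-style instruct chat template by hand.

    User turns are wrapped in [INST] ... [/INST]; assistant replies
    follow and are terminated by </s>. This roughly mirrors what
    apply_chat_template(..., tokenize=False) would return, but the
    tokenizer's own Jinja template is the source of truth.
    """
    prompt = "<s>"
    for msg in messages:
        if msg["role"] == "user":
            prompt += f"[INST] {msg['content']} [/INST]"
        elif msg["role"] == "assistant":
            prompt += f" {msg['content']}</s>"
    return prompt

chat = [
    {"role": "user", "content": "Hello, what is your name?"},
    {"role": "assistant", "content": "Hello, I am an AI model, how can I help you?"},
    {"role": "user", "content": "I am learning Python, can you give me a tip!"},
]
print(render_mistral_prompt(chat))
```

Passing `tokenize=False` to `apply_chat_template` is a handy way to print the actual rendered string and compare it against a sketch like this.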
Using the chat template:

```python
client = InferenceClient()
response = client.text_generation(
    model=model, prompt=tokenizer.decode(template[0]), max_new_tokens=512
)
```
Without using the chat template:

```python
access_token = "xxxxxxxxx"
client = InferenceClient(
    provider="nebius",
    api_key=access_token,
)
completion = client.chat.completions.create(
    model=model,
    messages=chat,
)
```