Recommended prompt form with apply_chat_template

#180
by denisousa - opened

I don't understand the difference, or the best way to build a prompt.

Is there any difference between using "apply_chat_template" and then calling "client.text_generation",
versus simply passing the messages and using "client.chat.completions"?

from huggingface_hub import InferenceClient
from transformers import AutoTokenizer

model = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model)

chat = [
    {"role": "user", "content": "Hello, what is your name?"},
    {
        "role": "assistant",
        "content": "Hello, I am an AI model, how can I help you?",
    },
    {"role": "user", "content": "I am learning Python, can you give me a tip!"},
]

template = tokenizer.apply_chat_template(
    chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)

Using chat template

client = InferenceClient()
response = client.text_generation(
    model=model, prompt=tokenizer.decode(template[0]), max_new_tokens=512
)
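For context, here is a rough, hand-written sketch of what a Mistral/Mixtral-style chat template renders those messages into. The real string comes from tokenizer.apply_chat_template (each model ships its own template); render_mistral_prompt below is a hypothetical helper, written only to make the idea concrete:

```python
# Illustrative only: a hand-rolled approximation of a Mistral-style instruct
# template. The authoritative rendering is whatever the model's tokenizer
# produces via apply_chat_template; exact spacing/tokens may differ.
def render_mistral_prompt(chat):
    # Mistral instruct format wraps each user turn in [INST] ... [/INST]
    # and terminates each assistant turn with the end-of-sequence token </s>.
    parts = ["<s>"]
    for msg in chat:
        if msg["role"] == "user":
            parts.append(f"[INST] {msg['content']} [/INST]")
        elif msg["role"] == "assistant":
            parts.append(f"{msg['content']}</s>")
    return "".join(parts)

chat = [
    {"role": "user", "content": "Hello, what is your name?"},
    {"role": "assistant", "content": "Hello, I am an AI model, how can I help you?"},
    {"role": "user", "content": "I am learning Python, can you give me a tip!"},
]
print(render_mistral_prompt(chat))
```

So with text_generation you are responsible for producing (and decoding) a string in exactly this shape before sending it.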

Without using chat template

access_token = "xxxxxxxxx"
client = InferenceClient(
    provider="nebius",
    api_key=access_token,
)

completion = client.chat.completions.create(
    model=model,
    messages=chat,
)
