---
license: mit
---

This model demonstrates a new kind of model optimization; it is based on Meta's Llama-3 8B Instruct.

#### Usage with Transformers AutoModelForCausalLM

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Load the tokenizer and the model in bfloat16, sharded across available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the chat with the Llama-3 chat template and move the ids to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Llama-3 Instruct ends assistant turns with <|eot_id|>, so treat it as a stop token
# alongside the regular eos token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
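
#### Usage with the Transformers pipeline

The same chat can also be run through the `transformers` text-generation pipeline, which handles tokenization and device placement internally. This is a minimal sketch, assuming the same model id, chat template, and sampling settings as the example above:

```python
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Build a text-generation pipeline with the same dtype and device placement as above.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the chat template to a prompt string instead of token ids.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# The pipeline returns the prompt followed by the completion; slice the prompt off.
print(outputs[0]["generated_text"][len(prompt):])
```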