code-example

#7
by pcuenq HF staff - opened
Files changed (1)
  1. README.md +52 -2
README.md CHANGED
@@ -24,8 +24,6 @@ Install `transformers`
24
  pip install transformers accelerate
25
  ```
26
 
27
- **Warning:** The 70B Instruct model has a different prompt template than the smaller versions. We'll update this repo soon.
28
-
29
  Model capabilities:
30
 
31
  - [x] Code completion.
@@ -33,6 +31,58 @@ Model capabilities:
33
  - [x] Instructions / chat.
34
  - [ ] Python specialist.
35
 
36
  ## Model Details
37
  *Note: Use of this model is governed by the Meta license. Meta developed and publicly released the Code Llama family of large language models (LLMs).
38
 
 
24
  pip install transformers accelerate
25
  ```
26
 
 
 
27
  Model capabilities:
28
 
29
  - [x] Code completion.
 
31
  - [x] Instructions / chat.
32
  - [ ] Python specialist.
33
 
34
+ **Chat use:** The 70B Instruct model uses a different prompt template than the smaller versions. To use it with `transformers`, we recommend you use the built-in chat template:
35
+
36
+ ```py
37
+ from transformers import AutoTokenizer, AutoModelForCausalLM
38
+ import transformers
39
+ import torch
40
+
41
+ model_id = "codellama/CodeLlama-70b-hf"
42
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
43
+ model = AutoModelForCausalLM.from_pretrained(
44
+     model_id,
45
+     torch_dtype=torch.float16,
46
+ ).to("cuda")
47
+
48
+ chat = [
49
+     {"role": "system", "content": "You are a helpful and honest code assistant expert in JavaScript. Please, provide all answers to programming questions in JavaScript"},
50
+     {"role": "user", "content": "Write a function that computes the set of sums of all contiguous sublists of a given list."},
51
+ ]
52
+ inputs = tokenizer.apply_chat_template(chat, return_tensors="pt").to("cuda")
+ output = model.generate(input_ids=inputs, max_new_tokens=200)
53
+ output = output[0].to("cpu")
54
+ print(tokenizer.decode(output))
55
+ ```
56
+
57
+ You can also use the model for **text or code completion**. This example uses the `transformers` `pipeline` interface:
58
+
59
+ ```py
60
+ from transformers import AutoTokenizer
61
+ import transformers
62
+ import torch
63
+
64
+ model_id = "codellama/CodeLlama-70b-hf"
65
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
66
+ pipeline = transformers.pipeline(
67
+     "text-generation",
68
+     model=model_id,
69
+     torch_dtype=torch.float16,
70
+     device_map="auto",
71
+ )
72
+
73
+ sequences = pipeline(
74
+     'def fibonacci(',
75
+     do_sample=True,
76
+     temperature=0.2,
77
+     top_p=0.9,
78
+     num_return_sequences=1,
79
+     eos_token_id=tokenizer.eos_token_id,
80
+     max_length=100,
81
+ )
82
+ for seq in sequences:
83
+     print(f"Result: {seq['generated_text']}")
84
+ ```
85
+
86
  ## Model Details
87
  *Note: Use of this model is governed by the Meta license. Meta developed and publicly released the Code Llama family of large language models (LLMs).
88
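
A note on the chat example in this diff: for a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so decoding the whole output prints the prompt as well as the answer. The sketch below shows one way to keep only the new tokens. The helper name `strip_prompt` and the toy token ids are hypothetical and used purely for illustration; they are not part of the README or of `transformers`.

```python
from typing import List, Sequence


def strip_prompt(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Return only the token ids that come after the prompt.

    Assumes the output sequence starts with the prompt tokens, which is
    how causal LM generation output is laid out.
    """
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return list(output_ids[prompt_length:])


# Toy ids standing in for real tensors: 4 prompt tokens, then 3 generated tokens.
prompt_ids = [1, 15, 27, 9]
output_ids = prompt_ids + [42, 43, 2]

new_ids = strip_prompt(output_ids, len(prompt_ids))
print(new_ids)
```

With the tensors from the chat example, the same idea would be to slice before decoding, for instance `tokenizer.decode(output[inputs.shape[-1]:], skip_special_tokens=True)`, where `inputs` is the prompt tensor passed to `generate`.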