ThiloteE committed 11c0263 (verified, parent 8ec99ad): Update README.md
---
license: mit
base_model: jpacifico/Chocolatine-3B-Instruct-DPO-Revised
pipeline_tag: text-generation
inference: false
model_creator: jpacifico
model_name: Chocolatine-3B-Instruct-DPO-Revised
model_type: phi3
language:
- fr
- en
datasets:
- jpacifico/french-orca-dpo-pairs-revised
library_name: transformers
quantized_by: ThiloteE
tags:
- text-generation-inference
- transformers
- GGUF
- GPT4All-community
- GPT4All
- conversational
- french
- chocolatine
---

> [!NOTE]

- Static quants of https://huggingface.co/jpacifico/Chocolatine-3B-Instruct-DPO-Revised at commit [fa3e742](https://huggingface.co/jpacifico/Chocolatine-3B-Instruct-DPO-Revised/commit/fa3e742dd80b3f38127fb62f5fc66eaf468fb95c)
- Quantized by [ThiloteE](https://huggingface.co/ThiloteE) with llama.cpp commit [e09a800](https://github.com/ggerganov/llama.cpp/commit/e09a800f9a9b19c73aa78e03b4c4be8ed988f3e6)

These quants were created with a customized configuration that has been proven not to cause visible end-of-string (EOS) tokens during inference with [GPT4All](https://www.nomic.ai/gpt4all).
The config.json, generation_config.json and tokenizer_config.json differ from the configuration found in the original model's repository at the time these quants were created.
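If you want to see exactly which fields were changed, a generic dictionary diff over the two JSON files is enough. A minimal sketch (the `diff_keys` helper and the sample values below are illustrative, not the actual changes; load the real files with `json.load` instead):

```python
def diff_keys(a: dict, b: dict) -> dict:
    """Map each differing key to its (original, patched) pair of values."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b))
            if a.get(k) != b.get(k)}

# Hypothetical example values -- not the real config contents.
original = {"eos_token_id": 32000, "max_length": 4096}
patched = {"eos_token_id": [32000, 32007], "max_length": 4096}
print(diff_keys(original, patched))  # {'eos_token_id': (32000, [32000, 32007])}
```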

# Prompt Template (for GPT4All)

Example System Prompt:
```
<|im_start|>system
<|system|>
Vous trouverez ci-dessous une instruction décrivant une tâche. Rédigez une réponse qui réponde de manière appropriée à la demande.<|end|><|im_end|>

```

Chat Template:
```
<|user|>
%1<|end|>
<|assistant|>
%2<|end|>

```
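In GPT4All, `%1` in the chat template is replaced with the user message and `%2` with the model's reply. That substitution can be sketched in plain Python (`build_turn` is an illustrative helper, not a GPT4All API):

```python
CHAT_TEMPLATE = "<|user|>\n%1<|end|>\n<|assistant|>\n%2<|end|>\n"

def build_turn(template: str, user_msg: str, response: str = "") -> str:
    # %2 is left empty for the turn currently being generated.
    return template.replace("%1", user_msg).replace("%2", response)

print(build_turn(CHAT_TEMPLATE, "Qu'est-ce qu'un LLM ?"))
```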

# Context Length

`4096`

Use a lower value during inference if you do not have enough RAM or VRAM.
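The context-dependent part of memory use is dominated by the KV cache, which grows linearly with the context length. A rough sketch (assuming Phi-3-mini's published shape of 32 layers and a hidden size of 3072, with an fp16 cache; llama.cpp's actual overhead will differ):

```python
def kv_cache_bytes(n_ctx: int, n_layers: int = 32, hidden_size: int = 3072,
                   bytes_per_value: int = 2) -> int:
    # K and V each hold n_ctx * hidden_size values per layer.
    return 2 * n_layers * n_ctx * hidden_size * bytes_per_value

print(kv_cache_bytes(4096) / 2**30)  # 1.5 (GiB)
print(kv_cache_bytes(2048) / 2**30)  # 0.75 -- halving the context halves the cache
```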
# Provided Quants

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/GPT4All-Community/Chocolatine-3B-Instruct-DPO-Revised-GGUF/resolve/main/Chocolatine-3B-Instruct-DPO-Revised-Q4_0.gguf?download=true) | Q4_0 | 2.44 | fast, recommended |
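The size figure lets you sanity-check the effective bits per weight; the result exceeds Q4_0's nominal 4 bits because the file also holds embeddings, higher-precision tensors, and metadata. A back-of-the-envelope sketch (assuming GB here means 10^9 bytes and the base model's 3.82B parameters):

```python
file_gb = 2.44    # Q4_0 file size from the table above
params_b = 3.82   # parameter count of the base model, in billions

bits_per_weight = file_gb * 8 / params_b
print(round(bits_per_weight, 2))  # 5.11
```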

# About GGUF

If you are unsure how to use GGUF files, refer to one of [TheBloke's READMEs](https://huggingface.co/TheBloke/DiscoLM_German_7b_v1-GGUF) for more details, including how to concatenate multi-part files.

Here is a handy graph by ikawrakow comparing some quant types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

# Thanks

I thank Mradermacher and TheBloke for the inspiration for this model card and for their contributions to open source, and 3Simplex for lots of help along the way.
Shoutout to the GPT4All and llama.cpp communities :-)

------

<!-- footer end -->
<!-- original-model-card start -->

------
------

# Original Model card:

---
library_name: transformers
license: mit
language:
- fr
- en
tags:
- french
- chocolatine
datasets:
- jpacifico/french-orca-dpo-pairs-revised
pipeline_tag: text-generation
---

### Chocolatine-3B-Instruct-DPO-Revised

DPO fine-tune of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) (3.82B params)
using the [jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised) RLHF dataset.
Training in French also improves the model in English, surpassing the performance of its base model.
Window context = 4k tokens

### Benchmarks

Chocolatine is the best-performing 3B model on the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) (August 2024)

![image/png](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Assets/openllm_choco3b_revised.png?raw=false)

| Metric              | Value |
|---------------------|------:|
| **Avg.**            | **27.63** |
| IFEval (0-shot)     | 56.23 |
| BBH (3-shot)        | 37.16 |
| MATH Lvl 5 (4-shot) | 14.50 |
| GPQA (0-shot)       |  9.62 |
| MuSR (0-shot)       | 15.10 |
| MMLU-PRO (5-shot)   | 33.21 |

### MT-Bench-French

Chocolatine-3B-Instruct-DPO-Revised outperforms GPT-3.5-Turbo on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french) by Bofeng Huang,
used with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench)

```
########## First turn ##########
                                           score
model                                turn
gpt-3.5-turbo                        1     8.1375
Chocolatine-3B-Instruct-DPO-Revised  1     7.9875
Daredevil-8B                         1     7.8875
Daredevil-8B-abliterated             1     7.8375
Chocolatine-3B-Instruct-DPO-v1.0     1     7.6875
NeuralDaredevil-8B-abliterated       1     7.6250
Phi-3-mini-4k-instruct               1     7.2125
Meta-Llama-3-8B-Instruct             1     7.1625
vigostral-7b-chat                    1     6.7875
Mistral-7B-Instruct-v0.3             1     6.7500
Mistral-7B-Instruct-v0.2             1     6.2875
French-Alpaca-7B-Instruct_beta       1     5.6875
vigogne-2-7b-chat                    1     5.6625
vigogne-2-7b-instruct                1     5.1375

########## Second turn ##########
                                           score
model                                turn
Chocolatine-3B-Instruct-DPO-Revised  2     7.937500
gpt-3.5-turbo                        2     7.679167
Chocolatine-3B-Instruct-DPO-v1.0     2     7.612500
NeuralDaredevil-8B-abliterated       2     7.125000
Daredevil-8B                         2     7.087500
Daredevil-8B-abliterated             2     6.873418
Meta-Llama-3-8B-Instruct             2     6.800000
Mistral-7B-Instruct-v0.2             2     6.512500
Mistral-7B-Instruct-v0.3             2     6.500000
Phi-3-mini-4k-instruct               2     6.487500
vigostral-7b-chat                    2     6.162500
French-Alpaca-7B-Instruct_beta       2     5.487395
vigogne-2-7b-chat                    2     2.775000
vigogne-2-7b-instruct                2     2.240506

########## Average ##########
                                     score
model
Chocolatine-3B-Instruct-DPO-Revised  7.962500
gpt-3.5-turbo                        7.908333
Chocolatine-3B-Instruct-DPO-v1.0     7.650000
Daredevil-8B                         7.487500
NeuralDaredevil-8B-abliterated       7.375000
Daredevil-8B-abliterated             7.358491
Meta-Llama-3-8B-Instruct             6.981250
Phi-3-mini-4k-instruct               6.850000
Mistral-7B-Instruct-v0.3             6.625000
vigostral-7b-chat                    6.475000
Mistral-7B-Instruct-v0.2             6.400000
French-Alpaca-7B-Instruct_beta       5.587866
vigogne-2-7b-chat                    4.218750
vigogne-2-7b-instruct                3.698113
```

### Usage

You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_3B_inference_test_colab.ipynb)

You can also run Chocolatine using the following code:

```python
import transformers
from transformers import AutoTokenizer

model_id = "jpacifico/Chocolatine-3B-Instruct-DPO-Revised"

# Format prompt
message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
tokenizer = AutoTokenizer.from_pretrained(model_id)
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer
)

# Generate text
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=200,
)
print(sequences[0]['generated_text'])
```

* **4-bit quantized version** is available here: [jpacifico/Chocolatine-3B-Instruct-DPO-Revised-Q4_K_M-GGUF](https://huggingface.co/jpacifico/Chocolatine-3B-Instruct-DPO-Revised-Q4_K_M-GGUF)

* **Ollama**: [jpacifico/chocolatine-3b](https://ollama.com/jpacifico/chocolatine-3b)

```bash
ollama run jpacifico/chocolatine-3b
```

Ollama *Modelfile* example:

```bash
FROM ./chocolatine-3b-instruct-dpo-revised-q4_k_m.gguf
TEMPLATE """{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>
"""
PARAMETER stop "<|end|>"
PARAMETER stop "<|user|>"
PARAMETER stop "<|assistant|>"
SYSTEM """You are a friendly assistant called Chocolatine."""
```

### Limitations

The Chocolatine model is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanism.

- **Developed by:** Jonathan Pacifico, 2024
- **Model type:** LLM
- **Language(s) (NLP):** French, English
- **License:** MIT

<!-- original-model-card end -->
<!-- end -->