---
license: gemma
datasets:
- FreedomIntelligence/ApolloMoEDataset
language:
- ar
- en
- zh
- ko
- ja
- mn
- th
- vi
- lo
- mg
- de
- pt
- es
- fr
- ru
- it
- hr
- gl
- cs
- co
- la
- uk
- bs
- bg
- eo
- sq
- da
- sa
- 'no'
- gn
- sr
- sk
- gd
- lb
- hi
- ku
- mt
- he
- ln
- bm
- sw
- ig
- rw
- ha
metrics:
- accuracy
base_model:
- google/gemma-2-9b
pipeline_tag: question-answering
tags:
- biology
- medical
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/Apollo2-9B-GGUF
This is a quantized version of [FreedomIntelligence/Apollo2-9B](https://huggingface.co/FreedomIntelligence/Apollo2-9B), created using llama.cpp.
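
The GGUF files in this repo can be run with any llama.cpp-based runtime. Below is a minimal sketch using the llama-cpp-python bindings; the exact `.gguf` filename depends on which quantization you download, so the path shown is an assumption:

```python
# Minimal sketch: run an Apollo2-9B GGUF quant with llama-cpp-python.
# The model filename (Q4_K_M here) is an assumption; use the quant you
# actually downloaded from this repo.
from llama_cpp import Llama

llm = Llama(model_path="./Apollo2-9B.Q4_K_M.gguf", n_ctx=4096)

# Apollo2-9B uses the plain User/Assistant format described in the
# original model card below; generation ends at the <eos> token.
prompt = "User:What is hypertension?\nAssistant:"
out = llm(prompt, max_tokens=256, stop=["<eos>", "User:"])
print(out["choices"][0]["text"])
```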

# Original Model Card

# Democratizing Medical LLMs for Many More Languages

Covering 12 major languages (English, Chinese, French, Hindi, Spanish, Arabic, Russian, Japanese, Korean, German, Italian, and Portuguese) and 38 minor languages so far.

<p align="center">
📃 <a href="https://arxiv.org/abs/2410.10626" target="_blank">Paper</a> • 🌐 <a href="" target="_blank">Demo</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a> • 🤗 <a href="https://huggingface.co/collections/FreedomIntelligence/apollomoe-and-apollo2-670ddebe3bb1ba1aebabbf2c" target="_blank">Models</a> • 🌐 <a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">Apollo</a> • 🌐 <a href="https://github.com/FreedomIntelligence/ApolloMoE" target="_blank">ApolloMoE</a>
</p>

![Apollo](assets/apollo_medium_final.png)

## 🌈 Update

* **[2024.10.15]** The ApolloMoE repo is published! 🎉

## Languages Coverage
12 major languages and 38 minor languages.

<details>
<summary>Click to view the languages coverage</summary>

![ApolloMoE](assets/languages.png)

</details>

## Architecture

<details>
<summary>Click to view the MoE routing image</summary>

![ApolloMoE](assets/hybrid_routing.png)

</details>

## Results

#### Dense
🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-0.5B" target="_blank">Apollo2-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-1.5B" target="_blank">Apollo2-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-2B" target="_blank">Apollo2-2B</a>

🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-3.8B" target="_blank">Apollo2-3.8B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-7B" target="_blank">Apollo2-7B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-9B" target="_blank">Apollo2-9B</a>

<details>
<summary>Click to view the Dense Models Results</summary>

![ApolloMoE](assets/dense_results.png)

</details>

#### Post-MoE
🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-0.5B" target="_blank">Apollo-MoE-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-1.5B" target="_blank">Apollo-MoE-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-7B" target="_blank">Apollo-MoE-7B</a>

<details>
<summary>Click to view the Post-MoE Models Results</summary>

![ApolloMoE](assets/post_moe_results.png)

</details>

## Usage Format
##### Apollo2
- 0.5B, 1.5B, 7B: `User:{query}\nAssistant:{response}<|endoftext|>`
- 2B, 9B: `User:{query}\nAssistant:{response}<eos>` (see the sketch below)
- 3.8B: `<|user|>\n{query}<|end|><|assistant|>\n{response}<|end|>`

##### Apollo-MoE
- 0.5B, 1.5B, 7B: `User:{query}\nAssistant:{response}<|endoftext|>`
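
As a concrete illustration of the 2B/9B template above (the helper name and example query here are ours, not part of the Apollo codebase):

```python
# Hypothetical helper for the Apollo2-2B/9B format shown above.
# The model completes the text after "Assistant:" and terminates the
# response with the Gemma-style <eos> token.
def build_apollo2_prompt(query: str) -> str:
    return f"User:{query}\nAssistant:"

print(build_apollo2_prompt("What are the first-line treatments for type 2 diabetes?"))
# -> User:What are the first-line treatments for type 2 diabetes?
#    Assistant:
```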

## Dataset & Evaluation

- Dataset
  🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a>

  <details><summary>Click to expand</summary>

  ![ApolloMoE](assets/Dataset.png)

  - [Data category](https://huggingface.co/datasets/FreedomIntelligence/ApolloCorpus/tree/main/train)

  </details>

- Evaluation
  🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a>

  <details><summary>Click to expand</summary>

  - EN:
    - [MedQA-USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
    - [MedMCQA](https://huggingface.co/datasets/medmcqa/viewer/default/test)
    - [PubMedQA](https://huggingface.co/datasets/pubmed_qa): Because the results fluctuated too much, it was not used in the paper.
    - [MMLU-Medical](https://huggingface.co/datasets/cais/mmlu)
      - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - ZH:
    - [MedQA-MCMLE](https://huggingface.co/datasets/bigbio/med_qa/viewer/med_qa_zh_4options_bigbio_qa/test)
    - [CMB-single](https://huggingface.co/datasets/FreedomIntelligence/CMB): Not used in the paper
      - Randomly sampled 2,000 single-answer multiple-choice questions
    - [CMMLU-Medical](https://huggingface.co/datasets/haonan-li/cmmlu)
      - Anatomy, Clinical_knowledge, College_medicine, Genetics, Nutrition, Traditional_chinese_medicine, Virology
    - [CMExam](https://github.com/williamliujl/CMExam): Not used in the paper
      - Randomly sampled 2,000 multiple-choice questions
  - ES: [Head_qa](https://huggingface.co/datasets/head_qa)
  - FR:
    - [Frenchmedmcqa](https://github.com/qanastek/FrenchMedMCQA)
    - [MMLU_FR]
      - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - HI: [MMLU_HI](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Hindi)
    - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - AR: [MMLU_AR](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Arabic)
    - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - JA: [IgakuQA](https://github.com/jungokasai/IgakuQA)
  - KO: [KorMedMCQA](https://huggingface.co/datasets/sean0042/KorMedMCQA)
  - IT:
    - [MedExpQA](https://huggingface.co/datasets/HiTZ/MedExpQA)
    - [MMLU_IT]
      - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - DE: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): German part
  - PT: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): Portuguese part
  - RU: [RuMedBench](https://github.com/sb-ai-lab/MedBench)

  </details>
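
All of the benchmarks above are multiple-choice and scored by accuracy. The project's actual evaluation harness lives in the ApolloMoE repo; purely as an illustration of the metric, a hypothetical scorer might look like this:

```python
# Hypothetical sketch of the accuracy metric used for these
# multiple-choice benchmarks; `predictions` and `answers` are assumed
# to be parallel lists of option letters such as "A", "B", "C", "D".
def accuracy(predictions: list[str], answers: list[str]) -> float:
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)

print(accuracy(["A", "C", "B"], ["A", "C", "D"]))  # 0.666...
```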

## Model Download and Inference
We take Apollo-MoE-0.5B as an example.

1. Log in to Hugging Face

```bash
huggingface-cli login --token $HUGGINGFACE_TOKEN
```

2. Download the model to a local directory

```python
from huggingface_hub import snapshot_download
import os

local_model_dir = os.path.join('/path/to/models/dir', 'Apollo-MoE-0.5B')
snapshot_download(repo_id="FreedomIntelligence/Apollo-MoE-0.5B", local_dir=local_model_dir)
```

3. Inference example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
import os

local_model_dir = os.path.join('/path/to/models/dir', 'Apollo-MoE-0.5B')

model = AutoModelForCausalLM.from_pretrained(local_model_dir, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(local_model_dir, trust_remote_code=True)

# Greedy decoding (do_sample=False), capped at a few new tokens for
# this short completion task.
generation_config = GenerationConfig.from_pretrained(
    local_model_dir,
    pad_token_id=tokenizer.pad_token_id,
    num_return_sequences=1,
    max_new_tokens=7,
    min_new_tokens=2,
    do_sample=False,
)

inputs = tokenizer(
    'Answer directly.\nThe capital of Mongolia is Ulaanbaatar.\nThe capital of Iceland is Reykjavik.\nThe capital of Australia is',
    return_tensors='pt',
)
inputs = inputs.to(model.device)
pred = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```

## Results reproduction
<details><summary>Click to expand</summary>

We take Apollo2-7B and Apollo-MoE-0.5B as examples.

1. Download the dataset for the project:

```bash
bash 0.download_data.sh
```

2. Prepare the test and dev data for the specific model:

   - Creates test data with the model's special tokens.

```bash
bash "1.data_process_test&dev.sh"
```

3. Prepare the training data for the specific model (creates tokenized data in advance):

   - You can adjust the training data order and the number of training epochs in this step.

```bash
bash 2.data_process_train.sh
```

4. Train the model

   - To train on multiple nodes, refer to ./src/sft/training_config/zero_multi.yaml.

```bash
bash 3.single_node_train.sh
```

5. Evaluate your model: generate scores on the benchmark

```bash
bash 4.eval.sh
```

</details>

## Citation
Please use the following citation if you intend to use our dataset for training or evaluation:

```
@misc{zheng2024efficientlydemocratizingmedicalllms,
      title={Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts},
      author={Guorui Zheng and Xidong Wang and Juhao Liang and Nuo Chen and Yuping Zheng and Benyou Wang},
      year={2024},
      eprint={2410.10626},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.10626},
}
```