---
license: mit
library_name: transformers
pipeline_tag: text-generation
base_model:
- nvidia/Llama-3.1-Minitron-4B-Depth-Base
datasets:
- BAAI/Infinity-Instruct
---
We fine-tune `nvidia/Llama-3.1-Minitron-4B-Depth-Base` with the LLM-Neo method, which combines LoRA and knowledge distillation (KD). The training data consists of 100k samples drawn from `BAAI/Infinity-Instruct`.
This repository contains the model described in the paper [LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models](https://hf.co/papers/2411.06839).
The project page is available [here](https://huggingface.co/collections/yang31210999/llm-neo-66e3c882f5579b829ff57eba) and the GitHub repository is available [here](https://github.com/yang3121099/LLM-Neo).
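For intuition, the following is a minimal sketch (not the authors' exact implementation) of the kind of combined LoRA + KD objective LLM-Neo describes: cross-entropy on the labels plus a KL distillation term against a frozen teacher, applied to a LoRA-adapted student. The weight `alpha` and temperature `T` are illustrative hyperparameters, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def neo_loss(student_logits, teacher_logits, labels, alpha=0.5, T=1.0):
    """Illustrative LoRA+KD loss: (1 - alpha) * CE + alpha * KL(teacher || student)."""
    # Standard next-token cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    # KL divergence between temperature-softened teacher and student
    # distributions, with the usual T^2 gradient rescaling.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return (1 - alpha) * ce + alpha * kd
```

During training, only the student's LoRA adapter weights receive gradients; the teacher is run under `torch.no_grad()`.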
## Basic Usage
This example demonstrates generating text with the model. Install the required libraries first: `pip install transformers accelerate torch` (`accelerate` is needed for `device_map="auto"`).
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_path = "yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-10w"

# Load the tokenizer and model; device_map="auto" places the weights on the
# available device(s), and bfloat16 halves the memory footprint.
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, device_map="auto", torch_dtype=torch.bfloat16
)

# Tokenize the prompt and move it to the same device as the model.
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 50 new tokens with moderate temperature.
generation_config = GenerationConfig(
    max_new_tokens=50, do_sample=True, temperature=0.7
)
outputs = model.generate(**inputs, generation_config=generation_config)

generated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(generated_text)
```
## Benchmarks
In this section, we report the results for `Llama-3.1-Minitron-4B-Depth-Neo-10w` on standard automatic benchmarks. For all the evaluations, we use the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) library.
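For reproducibility, a typical harness invocation looks like the sketch below, assuming a recent lm-evaluation-harness release (v0.4+) that exposes `lm_eval.simple_evaluate`; exact task names and arguments may vary between versions.

```python
import lm_eval

# Evaluate the model on one of the reported benchmarks (here: 0-shot MMLU).
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-10w,dtype=bfloat16",
    tasks=["mmlu"],  # other reported suites: "bbh", "ceval-valid", "cmmlu"
    num_fewshot=0,
)
print(results["results"])
```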
### Evaluation results
<table>
<tr>
<td><strong>Category</strong></td>
<td><strong>Benchmark</strong></td>
<td><strong>Version</strong></td>
<td><strong>n-shot</strong></td>
<td><strong>Metric</strong></td>
<td><strong>Value</strong></td>
<td><strong>Stderr</strong></td>
</tr>
<tr>
<td rowspan="3">BBH</td>
<td>BBH (General)</td>
<td>N/A</td>
<td>3</td>
<td>exact_match</td>
<td>0.4729</td>
<td>± 0.0055</td>
</tr>
<tr>
<td>BBH (Boolean Expressions)</td>
<td>2</td>
<td>3</td>
<td>exact_match</td>
<td>0.8120</td>
<td>± 0.0248</td>
</tr>
<tr>
<td>BBH (Date Understanding)</td>
<td>2</td>
<td>3</td>
<td>exact_match</td>
<td>0.6600</td>
<td>± 0.0300</td>
</tr>
<tr>
<td rowspan="4">CEVAL</td>
<td>CEVAL (General)</td>
<td>N/A</td>
<td>0</td>
<td>acc</td>
<td>0.4413</td>
<td>± 0.0135</td>
</tr>
<tr>
<td>CEVAL (Accountant)</td>
<td>1</td>
<td>0</td>
<td>acc</td>
<td>0.3469</td>
<td>± 0.0687</td>
</tr>
<tr>
<td>CEVAL (Advanced Mathematics)</td>
<td>1</td>
<td>0</td>
<td>acc</td>
<td>0.4737</td>
<td>± 0.1177</td>
</tr>
<tr>
<td>CEVAL (Art Studies)</td>
<td>1</td>
<td>0</td>
<td>acc</td>
<td>0.4545</td>
<td>± 0.0880</td>
</tr>
<tr>
<td rowspan="3">MMLU</td>
<td>MMLU (General)</td>
<td>N/A</td>
<td>0</td>
<td>acc</td>
<td>0.6048</td>
<td>± 0.0039</td>
</tr>
<tr>
<td>MMLU (Humanities)</td>
<td>N/A</td>
<td>0</td>
<td>acc</td>
<td>0.5552</td>
<td>± 0.0067</td>
</tr>
<tr>
<td>MMLU (STEM)</td>
<td>N/A</td>
<td>0</td>
<td>acc</td>
<td>0.5214</td>
<td>± 0.0086</td>
</tr>
<tr>
<td rowspan="2">CMMLU</td>
<td>CMMLU (General)</td>
<td>N/A</td>
<td>0</td>
<td>acc</td>
<td>0.3548</td>
<td>± 0.0044</td>
</tr>
<tr>
<td>CMMLU (Normalized)</td>
<td>N/A</td>
<td>0</td>
<td>acc_norm</td>
<td>0.3548</td>
<td>± 0.0044</td>
</tr>
</table>