---
license: mit
library_name: transformers
pipeline_tag: text-generation
base_model:
- nvidia/Llama-3.1-Minitron-4B-Depth-Base
datasets:
- BAAI/Infinity-Instruct
---
We fine-tune `nvidia/Llama-3.1-Minitron-4B-Depth-Base` with the LLM-Neo method, which combines LoRA and knowledge distillation (KD). The training data consists of 100k samples drawn from `BAAI/Infinity-Instruct`.
This repository contains the model described in the paper *LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models*. The project page is available here and the GitHub repository is available here.
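To make the LoRA + KD combination concrete, the snippet below sketches a single training step. It is a minimal illustration rather than the exact LLM-Neo recipe: the teacher checkpoint, LoRA rank and target modules, temperature, and the loss weight `alpha` are all assumptions for demonstration.

```python
# Minimal sketch of one LoRA + KD training step (illustrative, not the exact LLM-Neo recipe).
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Teacher checkpoint is an assumption; the student is the published base model.
teacher = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
).eval()
student = AutoModelForCausalLM.from_pretrained(
    "nvidia/Llama-3.1-Minitron-4B-Depth-Base", torch_dtype=torch.bfloat16, device_map="auto"
)

# Wrap the student with LoRA adapters so only the low-rank matrices are trained.
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
student = get_peft_model(student, lora_config)

tokenizer = AutoTokenizer.from_pretrained("nvidia/Llama-3.1-Minitron-4B-Depth-Base")
batch = tokenizer("An instruction sampled from Infinity-Instruct", return_tensors="pt").to(student.device)
labels = batch["input_ids"].clone()

# Teacher provides soft targets; the student is optimized.
with torch.no_grad():
    teacher_logits = teacher(**batch).logits
student_out = student(**batch, labels=labels)

# KD loss: KL divergence between temperature-softened teacher and student distributions.
T = 2.0
kd_loss = F.kl_div(
    F.log_softmax(student_out.logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * (T ** 2)

# Combine the supervised LM loss with the distillation loss (alpha is an assumption).
alpha = 0.5
loss = alpha * student_out.loss + (1 - alpha) * kd_loss
loss.backward()  # optimizer step omitted for brevity
```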
## Basic Usage
This example demonstrates generating text with the model. You'll need to install the necessary libraries first: `pip install transformers`.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_path = "yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-10w"

# Load the tokenizer and model (bfloat16 weights, placed automatically on available GPUs).
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, device_map="auto", torch_dtype=torch.bfloat16
)

# Tokenize the prompt and move it to the GPU.
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Sample up to 50 new tokens with temperature 0.7.
generation_config = GenerationConfig(
    max_new_tokens=50, do_sample=True, temperature=0.7
)
outputs = model.generate(**inputs, generation_config=generation_config)

generated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(generated_text)
```
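For quick experiments, the same model can also be loaded through the high-level `pipeline` API. The sketch below mirrors the sampling parameters above; they are example values, not tuned settings.

```python
import torch
from transformers import pipeline

# High-level alternative to the explicit generate() call above.
generator = pipeline(
    "text-generation",
    model="yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-10w",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
print(generator("Once upon a time", max_new_tokens=50, do_sample=True, temperature=0.7)[0]["generated_text"])
```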
## Benchmarks
In this section, we report the results for `Llama-3.1-Minitron-4B-Depth-Neo-10w` on standard automatic benchmarks. All evaluations use the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) library.
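A typical lm-evaluation-harness invocation is shown below. The exact task list, few-shot settings, and batch size used to produce the numbers in the table are not reproduced here, so treat this as a template rather than the exact command.

```bash
pip install lm-eval

lm_eval --model hf \
  --model_args pretrained=yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-10w,dtype=bfloat16 \
  --tasks mmlu,cmmlu,ceval-valid,bbh \
  --batch_size auto
```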
### Evaluation results
| Category | Benchmark | Version | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| BBH | BBH (General) | N/A | 3 | exact_match | 0.4729 | ± 0.0055 |
| | BBH (Boolean Expressions) | 2 | 3 | exact_match | 0.8120 | ± 0.0248 |
| | BBH (Date Understanding) | 2 | 3 | exact_match | 0.6600 | ± 0.0300 |
| CEVAL | CEVAL (General) | N/A | 0 | acc | 0.4413 | ± 0.0135 |
| | CEVAL (Accountant) | 1 | 0 | acc | 0.3469 | ± 0.0687 |
| | CEVAL (Advanced Mathematics) | 1 | 0 | acc | 0.4737 | ± 0.1177 |
| | CEVAL (Art Studies) | 1 | 0 | acc | 0.4545 | ± 0.0880 |
| MMLU | MMLU (General) | N/A | 0 | acc | 0.6048 | ± 0.0039 |
| | MMLU (Humanities) | N/A | 0 | acc | 0.5552 | ± 0.0067 |
| | MMLU (STEM) | N/A | 0 | acc | 0.5214 | ± 0.0086 |
| CMMLU | CMMLU (General) | N/A | 0 | acc | 0.3548 | ± 0.0044 |
| | CMMLU (Normalized) | N/A | 0 | acc_norm | 0.3548 | ± 0.0044 |