---
license: mit
library_name: transformers
pipeline_tag: text-generation
base_model:
- nvidia/Llama-3.1-Minitron-4B-Depth-Base
datasets:
- BAAI/Infinity-Instruct
---

We fine-tune `nvidia/Llama-3.1-Minitron-4B-Depth-Base` with the LLM-Neo method, which combines LoRA and knowledge distillation (KD); an illustrative sketch of such a combined objective is given at the end of this card. The training data consists of 100k samples drawn from `BAAI/Infinity-Instruct`.

This repository contains the model described in the paper [LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models](https://hf.co/papers/2411.06839). The project page is available [here](https://huggingface.co/collections/yang31210999/llm-neo-66e3c882f5579b829ff57eba) and the GitHub repository is available [here](https://github.com/yang3121099/LLM-Neo).

## Basic Usage

This example demonstrates generating text with the model. Install the necessary library first: `pip install transformers`.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
import torch

model_path = "yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-10w"

# Load the tokenizer and model; device_map="auto" places weights on available GPUs
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 50 new tokens with a moderate temperature
generation_config = GenerationConfig(
    max_new_tokens=50,
    do_sample=True,
    temperature=0.7,
)

outputs = model.generate(**inputs, generation_config=generation_config)
generated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(generated_text)
```

## Benchmarks

In this section, we report results for `Llama-3.1-Minitron-4B-Depth-Neo-10w` on standard automatic benchmarks. All evaluations use the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) library.
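The snippet below is a minimal sketch of how such a run can be launched through the harness's Python API. The task name, few-shot setting, and batch size shown here are illustrative assumptions and may not match the exact configuration behind the numbers in the table.

```python
# Minimal sketch of an lm-evaluation-harness run (assumed settings; the exact
# tasks and few-shot configuration behind the reported numbers may differ).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-10w,dtype=bfloat16",
    tasks=["mmlu"],   # e.g. "cmmlu", "ceval-valid", or "bbh" for the other rows
    num_fewshot=0,    # the BBH rows in the table use 3-shot
    batch_size=8,
)
print(results["results"]["mmlu"])
```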
### Evaluation results

| Category | Benchmark | Version | n-shot | Metric | Value | Stderr |
|----------|-----------|---------|--------|--------|-------|--------|
| BBH | BBH (General) | N/A | 3 | exact_match | 0.4729 | ± 0.0055 |
| | BBH (Boolean Expressions) | 2 | 3 | exact_match | 0.8120 | ± 0.0248 |
| | BBH (Date Understanding) | 2 | 3 | exact_match | 0.6600 | ± 0.0300 |
| CEVAL | CEVAL (General) | N/A | 0 | acc | 0.4413 | ± 0.0135 |
| | CEVAL (Accountant) | 1 | 0 | acc | 0.3469 | ± 0.0687 |
| | CEVAL (Advanced Mathematics) | 1 | 0 | acc | 0.4737 | ± 0.1177 |
| | CEVAL (Art Studies) | 1 | 0 | acc | 0.4545 | ± 0.0880 |
| MMLU | MMLU (General) | N/A | 0 | acc | 0.6048 | ± 0.0039 |
| | MMLU (Humanities) | N/A | 0 | acc | 0.5552 | ± 0.0067 |
| | MMLU (STEM) | N/A | 0 | acc | 0.5214 | ± 0.0086 |
| CMMLU | CMMLU (General) | N/A | 0 | acc | 0.3548 | ± 0.0044 |
| | CMMLU (Normalized) | N/A | 0 | acc_norm | 0.3548 | ± 0.0044 |
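As referenced in the introduction, the sketch below illustrates one way LoRA and KD can be combined in the spirit of LLM-Neo. It is an illustration under assumed settings, not the authors' training code: the teacher checkpoint, LoRA hyperparameters, loss weight `alpha`, and temperature are placeholders.

```python
# Illustrative only: one way to combine LoRA with knowledge distillation,
# in the spirit of LLM-Neo. The teacher choice, LoRA settings, alpha and
# temperature are placeholders, not the values used for this checkpoint.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

student_id = "nvidia/Llama-3.1-Minitron-4B-Depth-Base"
teacher_id = "meta-llama/Llama-3.1-8B-Instruct"  # hypothetical teacher

tokenizer = AutoTokenizer.from_pretrained(student_id)
student = AutoModelForCausalLM.from_pretrained(student_id)
teacher = AutoModelForCausalLM.from_pretrained(teacher_id).eval()

# Only the low-rank adapter weights are trained; the base student stays frozen.
student = get_peft_model(student, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

def lora_kd_step(batch, alpha=0.5, temperature=2.0):
    """Blend the standard LM loss with a KL term against the teacher's logits."""
    out = student(**batch, labels=batch["input_ids"])
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    kd = F.kl_div(
        F.log_softmax(out.logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2
    return alpha * kd + (1 - alpha) * out.loss

batch = tokenizer("Explain knowledge distillation in one sentence.", return_tensors="pt")
loss = lora_kd_step(batch)
loss.backward()  # gradients flow only into the LoRA adapters
```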