
🧠 LLaMA 3.2 1B (Finetuned via Distillation)

This model is a distilled version of meta-llama/Llama-3.2-3B-Instruct, using meta-llama/Llama-3.2-1B-Instruct as the student. Distillation was performed on a subset of WikiText-2 with prompt-based soft-label supervision from the teacher.

🧪 Training Method

We used logit matching: a KL-divergence loss between the teacher's and student's output logits. WikiText-2 provided the training data, and the prompt "How to learn a new language?" served as a simple qualitative test example.

Key settings:

  • Teacher: LLaMA-3.2-3B-Instruct
  • Student: LLaMA-3.2-1B-Instruct
  • Optimizer: AdamW
  • Loss: KLDivLoss on logits
  • Batch size: 16
  • Max tokens: 256
  • Training steps: ~10k (can vary)
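The loss in the settings above can be sketched as follows. This is a minimal illustration, not the released training code; the temperature value and tensor shapes are assumptions, and real logits would come from forward passes of the 3B teacher and 1B student on WikiText-2 batches.

```python
# Minimal sketch of logit-matching distillation (KL divergence on logits).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as target;
    # the T^2 factor keeps gradient scale comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature**2

# Dummy (batch, seq_len, vocab) logits stand in for real model outputs.
student = torch.randn(2, 8, 32)
teacher = torch.randn(2, 8, 32)
loss = distillation_loss(student, teacher)  # non-negative scalar
```

In practice this loss is backpropagated through the student only, with the teacher frozen and AdamW updating the student's parameters.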

💾 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("YiChuanH/llama1B-finetuned")
tokenizer = AutoTokenizer.from_pretrained("YiChuanH/llama1B-finetuned")

prompt = "How to learn a new language?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

📊 Evaluation

This model aims to preserve the output style and quality of the 3B teacher with significantly fewer parameters (1B). Qualitatively, its responses are more informative and instructive than those of the base 1B model.
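One common way to quantify this kind of claim is held-out perplexity on WikiText-2 (lower is better). The helper below is a hypothetical sketch, not part of the released code; the per-token losses would come from the model's cross-entropy on held-out text.

```python
import math

def perplexity(per_token_nlls):
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return math.exp(sum(per_token_nlls) / len(per_token_nlls))

# Example: per-token cross-entropy values collected from evaluation batches.
print(round(perplexity([2.1, 1.9, 2.0]), 2))  # exp(2.0) ≈ 7.39
```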

📚 Dataset

Distillation was performed on a subset of the WikiText-2 corpus, with examples truncated to 256 tokens (see the training settings above).

📌 Limitations

  • No RLHF or instruction fine-tuning beyond logit distillation
  • Not suitable for safety-critical applications
  • Quality may vary across tasks not seen during distillation

📄 License

This model is released under the same license as the base Llama 3.2 models (Meta's Llama 3.2 Community License); the distillation code and weights are released under MIT.
