# Model Card for qwen2.5-0.5B-Instruct-pruned-Inshort
`pruned(model=qwen2.5-0.5B-Instruct-Inshort) = qwen2.5-0.5B-Instruct-pruned-Inshort`
## Model Details

### Model Description
The model qwen2.5-0.5B-Instruct-pruned-Inshort is a pruned version of qwen2.5-0.5B-Instruct-Inshort.

**NOTE:** This model is part of my project exploring how to prune a capable teacher model and then recover its performance through distillation (specifically, behavior cloning) and supervised fine-tuning (SFT), targeting an Inshorts-style summarization task. Within that pipeline, this checkpoint serves as the pruned model, before any recovery training.
## Model Configuration: Normal vs. Pruned

| Component | Normal Model | Pruned Model |
|---|---|---|
| Decoder Layers | 24 | 11 |
| MLP Intermediate Size | 4864 | 4096 |
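To get a feel for how much the table above shrinks the network, here is a rough back-of-the-envelope estimate of the MLP parameter count before and after pruning. The `hidden_size = 896` value is an assumption taken from the published Qwen2.5-0.5B architecture (it is not stated in this card), and the gated-MLP structure (`gate_proj` + `up_proj` + `down_proj`) matches the Qwen2 family:

```python
# Rough MLP parameter estimate from the configuration table above.
# Assumption: hidden_size=896 (Qwen2.5-0.5B's published hidden dimension).
hidden_size = 896

def mlp_params(intermediate_size: int, num_layers: int) -> int:
    # Qwen2-style gated MLP has three projections per layer:
    # gate_proj and up_proj (hidden -> intermediate) and
    # down_proj (intermediate -> hidden), i.e. 3 * h * i weights.
    per_layer = 3 * hidden_size * intermediate_size
    return per_layer * num_layers

normal = mlp_params(4864, 24)   # 24 layers, intermediate size 4864
pruned = mlp_params(4096, 11)   # 11 layers, intermediate size 4096
print(f"normal MLP params: {normal:,}")
print(f"pruned MLP params: {pruned:,}")
print(f"reduction: {1 - pruned / normal:.1%}")
```

This counts only the MLP weights (embeddings and attention are untouched by this estimate), but it shows that the combined width and depth cuts remove well over half of the MLP parameters.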
## Pruning details

This work uses a hybrid pruning strategy that combines width and depth pruning, based on the methodologies outlined in *LLM Pruning and Distillation in Practice: The Minitron Approach* and *Compact Language Models via Pruning and Knowledge Distillation*.
Check out the Colab Notebook for the code.
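As a toy illustration of the depth-pruning half of that strategy: Minitron-style depth pruning ranks decoder layers by an importance estimate (e.g., the increase in validation loss when a layer is removed) and keeps only the top-scoring ones. The sketch below is not the notebook's actual code; the scores are invented for illustration, and only the selection logic is shown:

```python
# Toy Minitron-style layer selection: keep the k most important layers,
# preserving their original depth order. Importance scores would normally
# come from an ablation pass (e.g., per-layer loss deltas on held-out data);
# the values below are made up for illustration.
def select_layers(scores: list[float], k: int) -> list[int]:
    # indices of the k highest-scoring layers, sorted back into depth order
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)

scores = [0.9, 0.2, 0.8, 0.1, 0.7, 0.3]  # hypothetical importance per layer
kept = select_layers(scores, 3)
print(kept)  # the three layers that survive pruning, in original order
```

In the real pipeline the surviving layers' weights are copied into a smaller model config (11 decoder layers here), and width pruning then trims each MLP's intermediate dimension.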