Model Card for qwen2.5-0.5B-Instruct-pruned-Inshort

pruned(model=qwen2.5-0.5B-Instruct-Inshort) = qwen2.5-0.5B-Instruct-pruned-Inshort

Model Details

Model Description

The model qwen2.5-0.5B-Instruct-pruned-Inshort is a pruned version of qwen2.5-0.5B-Instruct-Inshort.


NOTE

This model is part of a project in which I explore pruning a capable teacher model and recovering its performance through distillation (specifically, behavior cloning) and supervised fine-tuning (SFT), focused on an Inshorts-style summarization task.


Within that pipeline, this checkpoint serves as the pruned model, before any distillation or SFT recovery is applied.
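
Because the pruned checkpoint keeps the standard Qwen2 architecture, it should load like any other chat model with Hugging Face Transformers. The snippet below is a minimal usage sketch; the repository id and prompt are placeholders, not confirmed by this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "qwen2.5-0.5B-Instruct-pruned-Inshort"  # placeholder; replace with the actual Hub repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")

article = "..."  # news article text to summarize
messages = [
    {"role": "user", "content": f"Summarize this news article in an Inshorts style:\n{article}"}
]

# Build the chat prompt, generate, and decode only the newly generated tokens
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```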

🧠 Model Configuration: Normal vs. Pruned

| 🔧 Component | 🟩 Normal Model | 🟦 Pruned Model |
|---|---|---|
| Decoder Layers | 24 | 11 |
| MLP Intermediate Size | 4864 | 4096 |
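
In terms of the Transformers config, this shrinkage corresponds to overriding two fields of the base Qwen2.5-0.5B-Instruct configuration. A sketch, assuming all other hyperparameters stay at their stock values:

```python
from transformers import Qwen2Config

# Stock configuration of the unpruned base model
base = Qwen2Config.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
print(base.num_hidden_layers, base.intermediate_size)  # 24, 4864

# Pruned configuration: 11 decoder layers, MLP intermediate size 4096
pruned = Qwen2Config.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",
    num_hidden_layers=11,
    intermediate_size=4096,
)
```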

Pruning details

This work uses a hybrid pruning strategy that combines width and depth pruning, based on the methodologies outlined in LLM Pruning and Distillation in Practice: The Minitron Approach and Compact Language Models via Pruning and Knowledge Distillation.

Check out the Colab Notebook for the code.
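
The notebook contains the actual implementation; the following is only a simplified sketch of the two pruning axes. The set of retained layers and the weight-magnitude importance proxy are illustrative assumptions, not the exact recipe used (the Minitron papers estimate importance from activations):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# --- Depth pruning: keep a subset of decoder layers (indices are illustrative) ---
keep_layers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 22, 23]  # 11 of the original 24 layers
layers = model.model.layers
model.model.layers = torch.nn.ModuleList([layers[i] for i in keep_layers])
model.config.num_hidden_layers = len(keep_layers)

# --- Width pruning: shrink each MLP's intermediate dimension from 4864 to 4096 ---
new_inter = 4096
for layer in model.model.layers:
    mlp = layer.mlp
    # Rank intermediate channels by a simple weight-magnitude proxy
    # (activation-based importance would be used in the Minitron approach).
    importance = mlp.down_proj.weight.abs().sum(dim=0)        # [intermediate_size]
    keep = importance.topk(new_inter).indices.sort().values   # keep the top channels

    mlp.gate_proj.weight.data = mlp.gate_proj.weight.data[keep, :]
    mlp.up_proj.weight.data = mlp.up_proj.weight.data[keep, :]
    mlp.down_proj.weight.data = mlp.down_proj.weight.data[:, keep]

    # Update layer metadata to match the pruned shapes
    mlp.gate_proj.out_features = new_inter
    mlp.up_proj.out_features = new_inter
    mlp.down_proj.in_features = new_inter

model.config.intermediate_size = new_inter
```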

Model size: 277M parameters (Safetensors, FP16)