---
license: apache-2.0
datasets:
- amphora/QwQ-LongCoT-130K-2
- PowerInfer/QWQ-LONGCOT-500K
- PowerInfer/LONGCOT-Refine-500K
language:
- en
metrics:
- perplexity
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
library_name: transformers
---

## Model Details:

- **Base Model:** Qwen/Qwen2.5-0.5B-Instruct
- **Teacher Model:** Qwen/QwQ-32B-Preview
- **Distillation Framework:** Instruction Tuning
- **Task Type:** Conversational AI / Causal Language Modeling
- **Parameters:** 0.5B
- **Special Features:**
  - Integrated gradient checkpointing for efficient training (see the sketch below)
  - Step-by-step reasoning capabilities for better problem-solving
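
As a rough illustration of the gradient-checkpointing feature (a sketch, not the published training configuration), it can be enabled through the standard `transformers` API:

```python
from transformers import AutoModelForCausalLM

# Illustrative only: trade extra compute for lower activation memory
# during fine-tuning by recomputing activations in the backward pass.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model.gradient_checkpointing_enable()
model.config.use_cache = False  # the KV cache is incompatible with checkpointing while training
```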

---

## Training:

QwQ-0.5B-Distilled was trained with supervised fine-tuning on **amphora/QwQ-LongCoT-130K-2**, **PowerInfer/QWQ-LONGCOT-500K**, and **PowerInfer/LONGCOT-Refine-500K**. It can serve as a competitive reasoning model on edge devices, as well as a draft model for Qwen/QwQ-32B-Preview.
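
As a minimal sketch of the data pipeline (split names and schema handling are assumptions; the exact preprocessing is not documented here), the three corpora can be loaded with the `datasets` library:

```python
from datasets import load_dataset, concatenate_datasets

# Assumed: each corpus exposes a "train" split.
corpora = [
    "amphora/QwQ-LongCoT-130K-2",
    "PowerInfer/QWQ-LONGCOT-500K",
    "PowerInfer/LONGCOT-Refine-500K",
]
splits = [load_dataset(name, split="train") for name in corpora]

# Note: concatenate_datasets requires matching columns across datasets;
# a real pipeline would first normalize each corpus to a shared chat schema.
train_data = concatenate_datasets(splits)
print(f"{len(train_data):,} SFT examples")
```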

### Training Progress:

[██████████] 100%

## Example Usage:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model name
model_name = "kz919/QwQ-0.5B-Distilled-SFT"

# Load the model onto GPU 0 in bfloat16
print(f"Starting to load the model {model_name} into memory")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map={"": 0}
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Define the prompt
prompt = "How many r's are in strawberry?"
messages = [
    {"role": "system", "content": "You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step."},
    {"role": "user", "content": prompt}
]

# Apply the chat template and tokenize the input
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a response
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096
)

# Keep only the newly generated tokens by slicing off the prompt
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# Decode the response
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

---

## Applications:

1. **Conversational Assistants:**
   Suitable for AI chatbots that require reasoning and long-context understanding.

2. **Educational Tools:**
   Provides step-by-step explanations, making it ideal for learning environments.

3. **Creative Writing:**
   Assists in generating coherent, contextually aware long-form content.

4. **Technical Support:**
   Handles complex customer queries with precision and clarity.

---

## Draft model for Qwen/QwQ-32B-Preview:

This model can be used as a draft model for [Qwen/QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview) in speculative decoding. Of every 5 tokens it drafts, we observe that on average 3 are accepted for math queries and 2.3 are accepted for general reasoning queries.
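
As a minimal sketch of that setup (assuming enough memory to host both models), `transformers` assisted generation takes the small model via the `assistant_model` argument of `generate`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch: the distilled model drafts tokens, the 32B target verifies them.
target = AutoModelForCausalLM.from_pretrained(
    "Qwen/QwQ-32B-Preview", torch_dtype=torch.bfloat16, device_map="auto"
)
draft = AutoModelForCausalLM.from_pretrained(
    "kz919/QwQ-0.5B-Distilled-SFT", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/QwQ-32B-Preview")

inputs = tokenizer("Prove that the sum of two odd numbers is even.", return_tensors="pt").to(target.device)
outputs = target.generate(**inputs, assistant_model=draft, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Assisted generation requires the draft and target to share a tokenizer, which holds here since both are Qwen-family models.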

---

## Limitations:

- While distilled for efficiency, performance on highly complex reasoning tasks may slightly trail the teacher model.
- This model may still be undertrained; it is merely a proof of concept. Don't yell at me if it's outputting nonsense.

---

## Citation:

If you use this model in your research or applications, please cite it as:

```bibtex
@misc{qwq_0.5B_distilled,
  author    = {Kaizhao Liang},
  title     = {Mini-QwQ: A Reasoning Model for Edge Devices},
  year      = {2024},
  publisher = {Hugging Face},
  version   = {1.0}
}
```

---

This model is an example of how efficient fine-tuning and distillation methods can deliver robust conversational AI capabilities in a smaller, more manageable footprint.