|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- knoveleng/open-s1 |
|
base_model: |
|
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B |
|
library_name: peft |
|
language: |
|
- en |
|
- zh |
|
pipeline_tag: question-answering |
|
tags: |
|
- reasoning |
|
--- |
|
|
|
## Introduction |
|
|
|
Tina (Tiny Reasoning Models via LoRA) models are all LoRA adapters fine-tuned on the base model [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B).
|
The LoRA adapter in this repo is fine-tuned on the dataset [knoveleng/open-s1](https://huggingface.co/datasets/knoveleng/open-s1).
|
Please refer to our paper [Tina: Tiny Reasoning Models via LoRA](https://arxiv.org/abs/2504.15777) for more training details. |
|
|
|
|
|
## Example Usage |
|
|
|
The Tina model is meant to be used together with the base model as a standard PEFT adapter. We release all checkpoints we have for each Tina model, and you can select a specific checkpoint by passing the `subfolder` argument.
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and its tokenizer.
base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
)

# Attach the Tina LoRA adapter; checkpoint-800 is the best-performing checkpoint.
model = PeftModel.from_pretrained(
    base_model,
    "Tina-Yi/R1-Distill-Qwen-1.5B-Open-RS1",
    subfolder="checkpoint-800"
)
```
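
Once the adapter is attached, the model can be used like any causal LM. Below is a minimal inference sketch; the prompt and generation settings are illustrative assumptions, not values from the paper.

```python
# Build a chat-formatted prompt (the question here is only an example).
messages = [{"role": "user", "content": "What is the sum of the first 10 positive integers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(base_model.device)

# Generate a reasoning trace and answer; max_new_tokens is an arbitrary choice.
outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```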