---
license: apache-2.0
datasets:
- knoveleng/open-s1
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
library_name: peft
language:
- en
- zh
pipeline_tag: question-answering
tags:
- reasoning
---

## Introduction

Tina (Tiny Reasoning Models via LoRA) models are all LoRA adapters fine-tuned on the base model [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B). The adapter in this repo is fine-tuned on the dataset [knoveleng/open-s1](https://huggingface.co/datasets/knoveleng/open-s1). Please refer to our paper [Tina: Tiny Reasoning Models via LoRA](https://arxiv.org/abs/2504.15777) for more training details.

## Example Usage

The Tina model is meant to be used together with the base model as a standard adapter. We release all checkpoints we have for each Tina model, and you can select a specific checkpoint by passing the `subfolder` argument.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and its tokenizer.
base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
)

# Attach the Tina LoRA adapter; checkpoint 800 is the best-performing checkpoint.
model = PeftModel.from_pretrained(
    base_model,
    "Tina-Yi/R1-Distill-Qwen-1.5B-Open-RS1",
    subfolder="checkpoint-800"
)
```
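
As a quick sanity check after loading, a minimal generation call could look like the sketch below. The prompt and generation settings here are illustrative assumptions, not taken from the paper or this card.

```python
# Illustrative only: the prompt and max_new_tokens value are assumptions.
prompt = "What is the sum of the first 100 positive integers?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If you need a standalone model for deployment, the LoRA weights can typically be folded back into the base model with PEFT's `merge_and_unload()`.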