upup-ashton-wang committed
Commit 3b20472 (verified)
1 Parent(s): e88023d

Update README.md

Files changed (1)
  1. README.md +37 -1
README.md CHANGED
@@ -5,4 +5,40 @@ datasets:
  base_model:
  - deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
  library_name: peft
- ---
+ language:
+ - en
+ - zh
+ pipeline_tag: question-answering
+ tags:
+ - reasoning
+ ---
+
+ ## Introduction
+
+ Tina (Tiny Reasoning Models via LoRA) models are all LoRA adapters fine-tuned on the base model [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B).
+ The LoRA adapter in this repo is fine-tuned on the dataset [knoveleng/open-s1](https://huggingface.co/datasets/knoveleng/open-s1).
+ Please refer to our paper [Tina: Tiny Reasoning Models via LoRA](https://arxiv.org/abs/2504.15777) for more training details.
+
+
+ ## Example Usage
+
+ A Tina model is meant to be used as a standard LoRA adapter on top of the base model. In particular, we release all training checkpoints for each Tina model, and you can select a checkpoint to use by specifying the `subfolder` argument.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ # Load the base model and its tokenizer
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(
+     "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
+ )
+
+ # Attach the Tina LoRA adapter; pick a training checkpoint via `subfolder`
+ model = PeftModel.from_pretrained(
+     base_model,
+     "Tina-Yi/R1-Distill-Qwen-1.5B-Open-RS1",
+     subfolder="checkpoint-800"  # checkpoint 800 is the best
+ )
+ ```
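+
+ After the adapter is loaded, the combined model can be used like any other `transformers` causal LM. The snippet below is a minimal, illustrative inference sketch; the prompt, the use of the base tokenizer's chat template, and the `max_new_tokens` value are assumptions for demonstration, not settings from the Tina paper.
+
+ ```python
+ # Illustrative only: prompt and generation settings are assumptions.
+ messages = [{"role": "user", "content": "What is 17 * 24? Please reason step by step."}]
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt",
+ ).to(base_model.device)
+
+ outputs = model.generate(inputs, max_new_tokens=1024)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```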