prithivMLmods committed on
Commit accd468 · verified · 1 Parent(s): d405de6

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -17,6 +17,8 @@ tags:
  - code
  ---
 
+ ![zdfbdccf.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/4XCMQEsE0mv2s5rx-YdIK.png)
+
  # Crux-Qwen3\_OpenThinking-4B
 
  > **Crux-Qwen3\_OpenThinking-4B** is fine-tuned on the **Qwen3-4B** architecture, optimized for advanced **open thinking**, **mathematical reasoning**, and **logical problem solving**. This model is trained on the traces of **sk1.1**, which include 1,000 entries from the **Gemini thinking trajectory**, combined with fine-tuning on 100k tokens of **open math reasoning** data. This makes it highly effective for nuanced reasoning, educational tasks, and complex problem-solving requiring clear thought processes.
@@ -106,4 +108,4 @@ print(response)
 
  ## References
 
- 1. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)
+ 1. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)