Xternn committed (verified) · Commit 89e5da9 · 1 parent: 56d4cbb

Update README.md

Files changed (1): README.md (+2, −3)
README.md CHANGED
```diff
@@ -47,15 +47,14 @@ library_name: transformers
 </p>
 
 
-## 1. Introduction
+## 1. By DyllBoy
 
 We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.
 DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning.
 With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors.
 However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance,
 we introduce DeepSeek-R1, which incorporates cold-start data before RL.
-DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
-To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
+salam hangat Dyll
 
 **NOTE: Before running DeepSeek-R1 series models locally, we kindly recommend reviewing the [Usage Recommendation](#usage-recommendations) section.**
```