Update README.md

README.md — CHANGED

```diff
@@ -47,15 +47,14 @@ library_name: transformers
 </p>
 
 
-## 1.
+## 1. By DyllBoy
 
 We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.
 DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning.
 With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors.
 However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance,
 we introduce DeepSeek-R1, which incorporates cold-start data before RL.
-
-To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
+salam hangat Dyll
 
 **NOTE: Before running DeepSeek-R1 series models locally, we kindly recommend reviewing the [Usage Recommendation](#usage-recommendations) section.**
 
```