YOYO-AI committed c7e4e24 · verified · 1 Parent(s): bd39e4f

Update README.md

Files changed (1)
  1. README.md +50 -3
README.md CHANGED
@@ -1,3 +1,50 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ - zh
+ base_model:
+ - deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
+ - Qwen/Qwen3-8B
+ pipeline_tag: text-generation
+ tags:
+ - merge
+ ---
+
+ # *Model Highlights:*
+
+ - ***Optimal merge method**: `nuslerp`*
+
+ - ***Highest precision**: `dtype: float32` + `out_dtype: bfloat16`*
+
+ - ***Brand-new chat template**: ensures normal operation on LM Studio*
+
+ - ***Long context length**: `131072`*
+
+ # *Parameter Settings*:
+ ## *Thinking Mode: (Recommended)*
+ > [!NOTE]
+ > *`Temperature=0.6`, `TopP=0.95`, `TopK=20`, `MinP=0`.*
+ ## *Non-thinking Mode: (Not recommended)*
+ *`/no_think` may not always work.*
+ > [!TIP]
+ > *`Temperature=0.7`, `TopP=0.8`, `TopK=20`, `MinP=0`.*
+ # *Configuration*:
+ *The following YAML configuration was used to produce this model:*
+
+ ```yaml
+ models:
+   - model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
+     parameters:
+       weight: 1
+   - model: Qwen/Qwen3-8B
+     parameters:
+       weight: 1
+ merge_method: nuslerp
+ tokenizer_source: Qwen/Qwen3-8B
+ parameters:
+   normalize: true
+   int8_mask: true
+ dtype: float32
+ out_dtype: bfloat16
+ ```
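
Reviewer note (not part of the commit): the configuration above merges two models with `merge_method: nuslerp` and equal weights. As a rough illustration of the underlying idea, here is a minimal pure-Python sketch of spherical linear interpolation (slerp) between two flat vectors. It is not mergekit's implementation. The helper names `fraction_from_weights` and `slerp` are hypothetical, and the sketch assumes that the two `weight: 1` entries reduce to an interpolation fraction of 0.5. It also does not model the `dtype` / `out_dtype` handling or the tensor-by-tensor processing of a real merge.

```python
import math
from typing import List, Sequence


def fraction_from_weights(w_a: float, w_b: float) -> float:
    """Assumed mapping from two relative weights to an interpolation fraction t."""
    return w_b / (w_a + w_b)


def slerp(t: float, a: Sequence[float], b: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Spherically interpolate from a (t=0) to b (t=1)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    linear = [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    if norm_a < eps or norm_b < eps:
        # A zero vector has no direction, so fall back to linear interpolation.
        return linear

    # Angle between the two vectors, with the cosine clamped against rounding error.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Parallel or antiparallel vectors: the slerp formula is ill-conditioned.
        return linear

    coeff_a = math.sin((1.0 - t) * omega) / sin_omega
    coeff_b = math.sin(t * omega) / sin_omega
    return [coeff_a * x + coeff_b * y for x, y in zip(a, b)]


# Equal weights, as in the configuration, give t = 0.5 under the assumption above.
t = fraction_from_weights(1, 1)
merged = slerp(t, [1.0, 0.0], [0.0, 1.0])
```

For two orthogonal unit vectors and `t = 0.5`, the result lies halfway along the arc between them and keeps unit length, whereas plain averaging would shrink the norm to about 0.707.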