YOYO-AI committed on
Commit 7039dab · verified · 1 Parent(s): f61dcbf

Update README.md

Files changed (1)
  1. README.md +59 -3
README.md CHANGED
@@ -1,3 +1,59 @@
- ---
- license: apache-2.0
- ---
---
license: apache-2.0
language:
- en
- zh
base_model:
- deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
- AXCXEPT/Qwen3-EZO-8B-beta
pipeline_tag: text-generation
tags:
- merge
---

# *Model Highlights:*

- ***Merge method**: `slerp` (see the short sketch after this list)*

- ***Highest precision**: `dtype: float32` + `out_dtype: bfloat16`*

- ***Brand-new chat template**: ensures normal operation in LM Studio*

- ***Context length**: `131072` tokens*
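
*A rough, illustrative sketch of what `slerp` (spherical linear interpolation) does to a pair of weight tensors. This is not the mergekit implementation, just the idea behind it:*

```python
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Interpolate between two weight tensors along the arc between their directions."""
    a_flat, b_flat = a.ravel(), b.ravel()
    # Angle between the two weight vectors, measured on their normalized directions.
    a_dir = a_flat / (np.linalg.norm(a_flat) + eps)
    b_dir = b_flat / (np.linalg.norm(b_flat) + eps)
    theta = np.arccos(np.clip(np.dot(a_dir, b_dir), -1.0, 1.0))
    if theta < eps:
        # Nearly parallel vectors: plain linear interpolation is numerically safer.
        mixed = (1.0 - t) * a_flat + t * b_flat
    else:
        # Standard slerp weights; t=0 returns a, t=1 returns b.
        mixed = (np.sin((1.0 - t) * theta) * a_flat + np.sin(t * theta) * b_flat) / np.sin(theta)
    return mixed.reshape(a.shape)

# The `t` curves in the configuration further down assign a different
# interpolation weight per parameter group (self_attn vs. mlp) across layer depth.
```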

## *Model Selection Table:*

|Model|Context|Uses Base Model|
|---|---|---|
|[Qwen3-EZO-8B-YOYO-nuslerp](https://huggingface.co/YOYO-AI/Qwen3-EZO-8B-YOYO-nuslerp)|32K|No|
|[Qwen3-EZO-8B-YOYO-nuslerp-128K](https://huggingface.co/YOYO-AI/Qwen3-EZO-8B-YOYO-nuslerp-128K)|128K|No|
|[Qwen3-EZO-8B-YOYO-slerp](https://huggingface.co/YOYO-AI/Qwen3-EZO-8B-YOYO-slerp)|32K|Yes|
|[Qwen3-EZO-8B-YOYO-slerp-128K](https://huggingface.co/YOYO-AI/Qwen3-EZO-8B-YOYO-slerp-128K)|128K|Yes|

> **Warning**:
> *Models with `128K` context may show slight quality loss. In most cases, please use the native `32K` context!*

# *Parameter Settings:*

## *Thinking Mode:*

> [!NOTE]
> *`Temperature=0.6`, `TopP=0.95`, `TopK=20`, `MinP=0`.*
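
*A minimal sketch of applying these thinking-mode settings with Hugging Face `transformers`; the repo id and prompt below are placeholders, and `min_p` requires a reasonably recent `transformers` release:*

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "YOYO-AI/Qwen3-EZO-8B-YOYO-slerp"  # placeholder: pick a model from the table above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Briefly explain what a slerp model merge is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.6,  # thinking-mode settings recommended above
    top_p=0.95,
    top_k=20,
    min_p=0.0,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```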

# *Configuration:*

*The following YAML configuration was used to produce this model:*

```yaml
slices:
  - sources:
      - model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
        layer_range: [0, 36]
      - model: AXCXEPT/Qwen3-EZO-8B-beta
        layer_range: [0, 36]
merge_method: slerp
base_model: AXCXEPT/Qwen3-EZO-8B-beta
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
tokenizer_source: base
dtype: float32
out_dtype: bfloat16
```
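
*For reference, a hedged sketch of how a configuration like this can be run with [mergekit](https://github.com/arcee-ai/mergekit); the file and output paths are placeholders, and the exact options used for this release are not documented here. The `mergekit-yaml config.yaml ./output-dir` CLI is the equivalent one-liner:*

```python
# Illustrative only: mergekit's Python entry point, assuming the YAML above is
# saved as slerp-config.yaml. Option values are generic, not the author's exact settings.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("slerp-config.yaml", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    "./Qwen3-EZO-8B-YOYO-slerp",            # placeholder output directory
    options=MergeOptions(copy_tokenizer=True),
)
```

*Per the `dtype`/`out_dtype` pair above, the interpolation is carried out in `float32` and the merged weights are written out in `bfloat16`.*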