YOYO-AI committed on
Commit 1ea3f32 · verified · 1 Parent(s): 98692a7

Update README.md

Files changed (1)
  1. README.md +52 -3
README.md CHANGED
@@ -1,3 +1,52 @@
- ---
- license: apache-2.0
- ---
+ ---
+ base_model:
+ - Qwen/Qwen3-14B
+ - Qwen/Qwen3-14B-Base
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ license: apache-2.0
+ language:
+ - en
+ - zh
+ pipeline_tag: text-generation
+ ---
+ # Qwen3-14B-YOYO
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged with the [DELLA](https://arxiv.org/abs/2406.11617) merge method, using [Qwen/Qwen3-14B-Base](https://huggingface.co/Qwen/Qwen3-14B-Base) as the base model.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [Qwen/Qwen3-14B](https://huggingface.co/Qwen/Qwen3-14B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+   - model: Qwen/Qwen3-14B
+     parameters:
+       density: 0.5
+       weight: 1
+       lambda: 0.9
+ merge_method: della
+ base_model: Qwen/Qwen3-14B-Base
+ parameters:
+   density: 1
+   weight: 1
+   lambda: 0.9
+   normalize: true
+   int8_mask: true
+ dtype: bfloat16
+ chat_template: "chatml"
+ tokenizer_source: Qwen/Qwen3-14B
+
+ ```
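
To give a rough feel for what the `density`, `weight`, and `lambda` values in the configuration control, here is a toy sketch of a drop-and-rescale merge on plain Python lists. It is not mergekit's implementation and not part of this commit. The function names `drop_and_rescale` and `merge_single` are hypothetical, and the drop step is deliberately simplified: every entry is kept with the same probability `density` (as in DARE), whereas DELLA assigns drop probabilities based on parameter magnitude. With only one fine-tuned model in the merge, the sign-election step has nothing to reconcile, so it is omitted.

```python
import random


def drop_and_rescale(delta, density, rng):
    """Keep each entry with probability `density`; rescale survivors by 1/density.

    Simplification: uniform keep probability, not DELLA's magnitude-based one.
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    return [d / density if rng.random() < density else 0.0 for d in delta]


def merge_single(base, tuned, density=0.5, weight=1.0, lam=0.9, seed=0):
    """Merge one fine-tuned parameter vector into a base vector.

    delta  = tuned - base               (task vector)
    sparse = drop_and_rescale(delta)    (stochastic pruning)
    merged = base + lam * weight * sparse
    """
    rng = random.Random(seed)
    delta = [t - b for t, b in zip(tuned, base)]
    sparse = drop_and_rescale(delta, density, rng)
    return [b + lam * weight * s for b, s in zip(base, sparse)]


if __name__ == "__main__":
    # Made-up toy values, only to show the call shape.
    base = [0.0, 1.0, -2.0]
    tuned = [1.0, 3.0, -2.5]
    print(merge_single(base, tuned, density=0.5, weight=1.0, lam=0.9))
```

With `density=1.0` nothing is dropped, so the result reduces to `base + lam * weight * (tuned - base)`. Lower densities zero out some of the delta entries and scale the survivors up, so the delta is preserved in expectation.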