YOYO-AI committed on
Commit
13d1691
·
verified ·
1 Parent(s): 570da39

Update README.md

  1. README.md +87 -20
README.md CHANGED
@@ -1,30 +1,96 @@
1
  ---
2
  base_model:
3
- - YOYO-AI/Qwen2.5-32B-YOYO-karcher-base
4
- - YOYO-AI/Qwen2.5-32B-YOYO-karcher
5
  library_name: transformers
6
  tags:
7
  - mergekit
8
  - merge
9
-
10
  ---
11
- # merge
12
-
13
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
14
-
15
- ## Merge Details
16
- ### Merge Method
17
-
18
- This model was merged using the [DELLA](https://arxiv.org/abs/2406.11617) merge method using [YOYO-AI/Qwen2.5-32B-YOYO-karcher-base](https://huggingface.co/YOYO-AI/Qwen2.5-32B-YOYO-karcher-base) as a base.
19
 
20
- ### Models Merged
21
-
22
- The following models were included in the merge:
23
- * [YOYO-AI/Qwen2.5-32B-YOYO-karcher](https://huggingface.co/YOYO-AI/Qwen2.5-32B-YOYO-karcher)
24
-
25
- ### Configuration
26
-
27
- The following YAML configuration was used to produce this model:
28
 
29
  ```yaml
30
  models:
@@ -43,4 +109,5 @@ parameters:
43
  int8_mask: true
44
  dtype: bfloat16
45
  tokenizer_source: base
46
- ```
 
 
1
  ---
2
  base_model:
3
+ - Qwen/Qwen2.5-Coder-32B
4
+ - Qwen/Qwen2.5-Coder-32B-Instruct
5
+ - tanliboy/lambda-qwen2.5-32b-dpo-test
6
+ - deepcogito/cogito-v1-preview-qwen-32B
7
+ - Qwen/Qwen2.5-32B-Instruct
8
+ - Qwen/QwQ-32B
9
+ - fblgit/TheBeagle-v2beta-32B-MGS
10
+ - Skywork/Skywork-OR1-32B-Preview
11
+ - qihoo360/Light-R1-32B
12
+ - AXCXEPT/EZO-Qwen2.5-32B-Instruct
13
+ - Qwen/Qwen2.5-32B
14
+ - EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2
15
+ - arcee-ai/Virtuoso-Medium-v2
16
+ - Azure99/Blossom-V6-32B
17
  library_name: transformers
18
  tags:
19
  - mergekit
20
  - merge
21
+ license: apache-2.0
22
+ language:
23
+ - en
24
+ - zh
25
+ pipeline_tag: text-generation
26
  ---
27
+ # Qwen2.5-32B-YOYO-V2
28
+ *The second-generation YOYO **32B** model is released!*
29
+ ***Highlights:***
30
+ *1. Uses the **Karcher** merging method.*
31
+ *2. Integrates **high-performance 32B models** from the open-source community.*
32
 
33
+ ## First stage:
34
+ *Make a code model:*
35
+ ```yaml
36
+ models:
37
+   - model: Qwen/Qwen2.5-Coder-32B-Instruct
38
+     parameters:
39
+       density: 1
40
+       weight: 1
41
+       lambda: 0.9
42
+ merge_method: della
43
+ base_model: Qwen/Qwen2.5-Coder-32B
44
+ parameters:
45
+   density: 1
46
+   weight: 1
47
+   lambda: 0.9
48
+   normalize: true
49
+   int8_mask: true
50
+ dtype: bfloat16
51
+ name: YOYO-AI/Qwen2.5-Coder-32B-YOYO
52
+ ```
53
+ ## Second stage:
54
+ *Make an instruction model:*
55
+ ```yaml
56
+ models:
57
+   - model: YOYO-AI/Qwen2.5-Coder-32B-YOYO
58
+   - model: Qwen/QwQ-32B
59
+   - model: Skywork/Skywork-OR1-32B-Preview
60
+   - model: deepcogito/cogito-v1-preview-qwen-32B
61
+   - model: qihoo360/Light-R1-32B
62
+   - model: AXCXEPT/EZO-Qwen2.5-32B-Instruct
63
+   - model: fblgit/TheBeagle-v2beta-32B-MGS
64
+   - model: tanliboy/lambda-qwen2.5-32b-dpo-test
65
+   - model: Qwen/Qwen2.5-32B-Instruct
66
+ merge_method: karcher
67
+ base_model: Qwen/Qwen2.5-32B-Instruct
68
+ parameters:
69
+   max_iter: 1000
70
+   normalize: true
71
+   int8_mask: true
72
+ tokenizer_source: base
73
+ dtype: float16
74
+ name: YOYO-AI/Qwen2.5-32B-YOYO-karcher
75
+ ```
76
+ ## Third stage:
77
+ *Make a base model:*
78
+ ```yaml
79
+ models:
80
+   - model: EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2
81
+   - model: Azure99/Blossom-V6-32B
82
+   - model: arcee-ai/Virtuoso-Medium-v2
83
+ merge_method: karcher
84
+ base_model: Qwen/Qwen2.5-32B
85
+ parameters:
86
+   max_iter: 1000
87
+   normalize: true
88
+   int8_mask: true
89
+ tokenizer_source: base
90
+ dtype: float16
91
+ name: YOYO-AI/Qwen2.5-32B-YOYO-karcher-base
92
+ ```
93
+ ## Final stage:
94
 
95
  ```yaml
96
  models:
 
109
  int8_mask: true
110
  dtype: bfloat16
111
  tokenizer_source: base
112
+ name: YOYO-AI/Qwen2.5-32B-YOYO-V2
113
+ ```
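
Note on stage ordering: the stages in this README depend on each other, since the second stage consumes the first stage's output. The sketch below orders the four stages with the standard-library `graphlib` module and builds (but does not run) one `mergekit-yaml` invocation per stage. The config and output paths are hypothetical, and the inputs of the final stage are inferred from the removed README text, because the final config body is elided in this diff.

```python
from graphlib import TopologicalSorter

# Stage output -> earlier stage outputs it consumes (upstream Hub models omitted).
STAGE_INPUTS = {
    "YOYO-AI/Qwen2.5-Coder-32B-YOYO": set(),
    "YOYO-AI/Qwen2.5-32B-YOYO-karcher": {"YOYO-AI/Qwen2.5-Coder-32B-YOYO"},
    "YOYO-AI/Qwen2.5-32B-YOYO-karcher-base": set(),
    # Assumption: inferred from the removed README text; the final config is elided here.
    "YOYO-AI/Qwen2.5-32B-YOYO-V2": {
        "YOYO-AI/Qwen2.5-32B-YOYO-karcher",
        "YOYO-AI/Qwen2.5-32B-YOYO-karcher-base",
    },
}


def merge_order(stage_inputs):
    """Return stage names so that every stage comes after the stages it consumes."""
    return list(TopologicalSorter(stage_inputs).static_order())


def plan_commands(order, config_dir="configs", out_dir="merged"):
    """Build one mergekit-yaml command per stage; paths are hypothetical."""
    commands = []
    for name in order:
        short = name.split("/")[-1]
        commands.append(
            ["mergekit-yaml", f"{config_dir}/{short}.yaml", f"{out_dir}/{short}"]
        )
    return commands


if __name__ == "__main__":
    for cmd in plan_commands(merge_order(STAGE_INPUTS)):
        print(" ".join(cmd))
```

When merging locally, the `model:` entries that refer to earlier stage outputs would need to point at the corresponding local output directories rather than Hub names.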