---
license: apache-2.0
language:
- en
- zh
base_model:
- Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-14B-Instruct-1M
- arcee-ai/Virtuoso-Small-v2
- deepcogito/cogito-v1-preview-qwen-14B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: text-generation
tags:
- merge
---
> *We used the Karcher merging method to average five of the most representative Qwen2.5 14B derivative models, in commemoration of the open-source community's contributions to the Qwen2.5 14B ecosystem.*
# *Merge Method*
*This model was merged using the [Karcher Mean](https://github.com/arcee-ai/mergekit/blob/main/docs/merge_methods.md#karcher-mean-karcher) merge method.*
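The Karcher mean is the Riemannian barycenter: rather than averaging weight tensors element-wise, each tensor is treated as a point on a hypersphere and the iterative fixed-point algorithm finds the point minimizing the sum of squared geodesic distances (the `max_iter` parameter in the config below bounds these iterations). The following is a minimal NumPy sketch of that iteration for unit vectors; the function name and tolerance are illustrative, not mergekit's actual implementation, which applies this idea tensor-by-tensor across the models.

```python
import numpy as np

def karcher_mean_sphere(points, max_iter=1000, tol=1e-9):
    """Iteratively compute the Karcher (Riemannian) mean of vectors on the
    unit hypersphere via tangent-space averaging (illustrative sketch)."""
    # Normalize inputs onto the unit sphere.
    X = np.stack([p / np.linalg.norm(p) for p in points])
    mu = X.mean(axis=0)
    mu /= np.linalg.norm(mu)
    for _ in range(max_iter):
        # Log map: project each point into the tangent space at mu.
        dots = np.clip(X @ mu, -1.0, 1.0)
        thetas = np.arccos(dots)             # geodesic distances to mu
        tangents = X - dots[:, None] * mu    # components orthogonal to mu
        norms = np.linalg.norm(tangents, axis=1)
        safe = norms > 1e-12
        logs = np.zeros_like(X)
        logs[safe] = tangents[safe] * (thetas[safe] / norms[safe])[:, None]
        # Average in the tangent space; stop when the update vanishes.
        v = logs.mean(axis=0)
        vnorm = np.linalg.norm(v)
        if vnorm < tol:
            break
        # Exp map: move mu along the averaged tangent direction.
        mu = np.cos(vnorm) * mu + np.sin(vnorm) * (v / vnorm)
        mu /= np.linalg.norm(mu)
    return mu
```

For points clustered on one hemisphere the iteration converges quickly; the Euclidean mean (followed by renormalization) is only a first-order approximation of this result.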

# *Models Merged*
*The following models were included in the merge:*

* [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
* [Qwen/Qwen2.5-14B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M)
* [arcee-ai/Virtuoso-Small-v2](https://huggingface.co/arcee-ai/Virtuoso-Small-v2)
* [deepcogito/cogito-v1-preview-qwen-14B](https://huggingface.co/deepcogito/cogito-v1-preview-qwen-14B)
* [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)

# *Configuration*
*The following YAML configuration was used to produce this model:*

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
  - model: Qwen/Qwen2.5-14B-Instruct-1M
  - model: arcee-ai/Virtuoso-Small-v2
  - model: deepcogito/cogito-v1-preview-qwen-14B
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
merge_method: karcher
parameters:
  max_iter: 1000
dtype: bfloat16
tokenizer_source: base
```
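To reproduce the merge, the configuration above can be passed to mergekit's `mergekit-yaml` entry point (assuming mergekit is installed and the base models are downloadable; the output path is illustrative):

```shell
# Install mergekit, then run the merge from the saved YAML config.
pip install mergekit
mergekit-yaml config.yaml ./merged-qwen2.5-14b
```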