---
license: apache-2.0
language:
- en
- zh
base_model:
- Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-14B-Instruct-1M
- arcee-ai/Virtuoso-Small-v2
- deepcogito/cogito-v1-preview-qwen-14B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: text-generation
tags:
- merge
---
> *We used the Karcher merging method to average five of the most representative Qwen2.5 14B derivative models, commemorating the open-source community's efforts around the Qwen2.5 14B model.*
# *Merge Method*
*This model was merged using the [Karcher Mean](https://github.com/arcee-ai/mergekit/blob/main/docs/merge_methods.md#karcher-mean-karcher) merge method.*
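*For intuition: the Karcher (Riemannian) mean generalizes the arithmetic mean to curved spaces by repeatedly averaging points in the tangent space at the current estimate and moving the estimate along that average. Below is a minimal sketch of this iteration for unit vectors on the hypersphere; the function name and structure are illustrative and are not mergekit's actual implementation:*

```python
import numpy as np

def karcher_mean_sphere(points, max_iter=1000, tol=1e-7):
    """Karcher (Riemannian) mean of unit vectors on the hypersphere,
    computed by iterative tangent-space averaging (illustrative sketch)."""
    # Start from the normalized Euclidean mean as the initial estimate.
    mu = np.mean(points, axis=0)
    mu /= np.linalg.norm(mu)
    for _ in range(max_iter):
        tangents = []
        for p in points:
            # Log map: project p into the tangent space at mu.
            cos_t = np.clip(np.dot(mu, p), -1.0, 1.0)
            theta = np.arccos(cos_t)
            if theta < 1e-12:
                tangents.append(np.zeros_like(mu))
                continue
            tangents.append((p - cos_t * mu) * (theta / np.sin(theta)))
        step = np.mean(tangents, axis=0)
        norm = np.linalg.norm(step)
        if norm < tol:  # converged: tangent average is (near) zero
            break
        # Exp map: move mu along the averaged tangent direction.
        mu = mu * np.cos(norm) + (step / norm) * np.sin(norm)
        mu /= np.linalg.norm(mu)
    return mu
```

*The `max_iter` parameter here plays the same role as `max_iter: 1000` in the configuration below: it caps the number of averaging iterations.*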
# *Models Merged*
*The following models were included in the merge:*
* [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
* [Qwen/Qwen2.5-14B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M)
* [arcee-ai/Virtuoso-Small-v2](https://huggingface.co/arcee-ai/Virtuoso-Small-v2)
* [deepcogito/cogito-v1-preview-qwen-14B](https://huggingface.co/deepcogito/cogito-v1-preview-qwen-14B)
* [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
# *Configuration*
*The following YAML configuration was used to produce this model:*
```yaml
models:
- model: Qwen/Qwen2.5-14B-Instruct
- model: Qwen/Qwen2.5-14B-Instruct-1M
- model: arcee-ai/Virtuoso-Small-v2
- model: deepcogito/cogito-v1-preview-qwen-14B
- model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
merge_method: karcher
parameters:
max_iter: 1000
dtype: bfloat16
tokenizer_source: base
```
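*Assuming mergekit is installed (`pip install mergekit`), a configuration like the one above can typically be reproduced with the `mergekit-yaml` entry point; the file and output paths here are illustrative:*

```shell
# Save the YAML above as merge.yaml, then run the merge.
# Note: this downloads all five source models and is resource-intensive.
mergekit-yaml merge.yaml ./merged-model
```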