---
license: apache-2.0
language:
- en
- zh
base_model:
- Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-14B-Instruct-1M
- arcee-ai/Virtuoso-Small-v2
- deepcogito/cogito-v1-preview-qwen-14B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: text-generation
tags:
- merge
---
> *We used the Karcher mean merge method to average five of the most representative Qwen2.5-14B derivative models, in tribute to the open-source community's work on Qwen2.5-14B.*
# *Merge Method*
*This model was merged using the [Karcher Mean](https://github.com/arcee-ai/mergekit/blob/main/docs/merge_methods.md#karcher-mean-karcher) merge method.*
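*For intuition, here is a minimal sketch of a Karcher (Riemannian) mean on the unit sphere in pure Python. It is not mergekit's implementation: the helper names, the `tol` stopping threshold, and the fixed step size of 1 are assumptions made for illustration. Only `max_iter` corresponds to a parameter that appears in this model's configuration.*

```python
import math


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def _normalize(v):
    n = _norm(v)
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / n for x in v]


def _log_map(p, q):
    """Tangent vector at p pointing toward q; its length is the geodesic distance."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    theta = math.acos(dot)
    u = [b - dot * a for a, b in zip(p, q)]  # part of q orthogonal to p
    un = _norm(u)
    if theta < 1e-12 or un < 1e-12:
        return [0.0] * len(p)
    return [theta * x / un for x in u]


def _exp_map(p, v):
    """Walk from p along the geodesic with initial velocity v."""
    n = _norm(v)
    if n < 1e-12:
        return list(p)
    return [math.cos(n) * a + math.sin(n) * b / n for a, b in zip(p, v)]


def karcher_mean(points, weights=None, max_iter=1000, tol=1e-9):
    """Iteratively estimate the weighted Karcher mean of points on the unit sphere.

    Hypothetical helper for illustration; `tol` is an assumed convergence threshold.
    """
    points = [_normalize(p) for p in points]
    k, dim = len(points), len(points[0])
    weights = weights or [1.0 / k] * k

    # Start from the normalized arithmetic mean, then refine on the sphere.
    mean = _normalize(
        [sum(w * p[i] for w, p in zip(weights, points)) for i in range(dim)]
    )
    for _ in range(max_iter):
        # Weighted average of the tangent vectors pointing at each input.
        tangent = [0.0] * dim
        for w, p in zip(weights, points):
            t = _log_map(mean, p)
            tangent = [a + w * b for a, b in zip(tangent, t)]
        if _norm(tangent) < tol:
            break  # converged: the inputs balance out around the current mean
        mean = _exp_map(mean, tangent)
    return mean


# Three unit vectors on a circle at angles 0, 0.2 and 1.0 rad.
# On a circle the Karcher mean is the mean of the angles, here 0.4 rad.
inputs = [[math.cos(a), math.sin(a)] for a in (0.0, 0.2, 1.0)]
print(karcher_mean(inputs))
```

*In this sketch `max_iter` only caps the number of refinement steps; the loop exits earlier once the averaged tangent vector falls below `tol`.*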
# *Models Merged*
*The following models were included in the merge:*
* [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
* [Qwen/Qwen2.5-14B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M)
* [arcee-ai/Virtuoso-Small-v2](https://huggingface.co/arcee-ai/Virtuoso-Small-v2)
* [deepcogito/cogito-v1-preview-qwen-14B](https://huggingface.co/deepcogito/cogito-v1-preview-qwen-14B)
* [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
# *Configuration*
*The following YAML configuration was used to produce this model:*
```yaml
models:
- model: Qwen/Qwen2.5-14B-Instruct
- model: Qwen/Qwen2.5-14B-Instruct-1M
- model: arcee-ai/Virtuoso-Small-v2
- model: deepcogito/cogito-v1-preview-qwen-14B
- model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
merge_method: karcher
parameters:
  max_iter: 1000
dtype: bfloat16
tokenizer_source: base
```