---
license: apache-2.0
language:
- en
- zh
base_model:
- Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-14B-Instruct-1M
- arcee-ai/Virtuoso-Small-v2
- deepcogito/cogito-v1-preview-qwen-14B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: text-generation
tags:
- merge
---

> *We used the Karcher mean merge method to average five of the most representative Qwen2.5 14B derivative models, in recognition of the open-source community's work on Qwen2.5 14B.*

# *Merge Method*

*This model was merged using the [Karcher Mean](https://github.com/arcee-ai/mergekit/blob/main/docs/merge_methods.md#karcher-mean-karcher) merge method.*

# *Models Merged*

*The following models were included in the merge:*

* [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
* [Qwen/Qwen2.5-14B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M)
* [arcee-ai/Virtuoso-Small-v2](https://huggingface.co/arcee-ai/Virtuoso-Small-v2)
* [deepcogito/cogito-v1-preview-qwen-14B](https://huggingface.co/deepcogito/cogito-v1-preview-qwen-14B)
* [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)

# *Configuration*

*The following YAML configuration was used to produce this model:*

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
  - model: Qwen/Qwen2.5-14B-Instruct-1M
  - model: arcee-ai/Virtuoso-Small-v2
  - model: deepcogito/cogito-v1-preview-qwen-14B
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
merge_method: karcher
parameters:
  max_iter: 1000
dtype: bfloat16
tokenizer_source: base
```
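
*To reproduce the merge, the configuration can be passed to mergekit's `mergekit-yaml` command. The sketch below wraps that call in Python. The file name `config.yaml`, the output directory `./merged-model`, and the helper names are hypothetical, and the optional `--cuda` flag assumes a GPU is available.*

```python
import shutil
import subprocess


def build_merge_command(config_path, output_dir, use_cuda=False):
    """Build the argument list for the `mergekit-yaml` command-line tool."""
    command = ["mergekit-yaml", config_path, output_dir]
    if use_cuda:
        command.append("--cuda")  # run the merge arithmetic on a GPU
    return command


def run_merge(config_path, output_dir, use_cuda=False):
    """Run the merge; raises if mergekit is not installed or the merge fails."""
    if shutil.which("mergekit-yaml") is None:
        raise RuntimeError("mergekit-yaml not found; install mergekit first")
    subprocess.run(build_merge_command(config_path, output_dir, use_cuda), check=True)


if __name__ == "__main__":
    # Hypothetical paths: save the YAML above as config.yaml beforehand.
    run_merge("config.yaml", "./merged-model")
```

*Running the merge downloads all five source checkpoints, so it needs enough disk space to hold them.*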