---
base_model:
- cutelemonlili/Qwen2.5-1.5B-Instruct_MATH_training_response_Qwen2.5_14B
- Xiaojian9992024/Qwen2.5-Ultra-1.5B-25.02-Exp
- UWNSL/Qwen2.5-1.5B-Instruct_Short_CoT
library_name: transformers
tags:
- mergekit
- merge
---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
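Since the card declares `library_name: transformers`, the merged model can be loaded like any other Qwen2.5 checkpoint. A minimal usage sketch follows; `your-username/merged-model` is a hypothetical placeholder, substitute the repo id this merge is actually published under.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/merged-model"  # hypothetical: replace with the real repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="float16")

# Qwen2.5-Instruct models ship a chat template, so format input as messages.
messages = [{"role": "user", "content": "What is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```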
## Merge Details

### Merge Method

This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with [Xiaojian9992024/Qwen2.5-Ultra-1.5B-25.02-Exp](https://huggingface.co/Xiaojian9992024/Qwen2.5-Ultra-1.5B-25.02-Exp) as the base model.
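For intuition, TIES works in three steps per parameter tensor: trim each fine-tuned model's task vector (its delta from the base) to the top-`density` fraction by magnitude, elect a dominant sign per element, then average only the deltas that agree with that sign. The sketch below illustrates those steps for a single tensor with a uniform `weight`; it is my own simplified rendering, not mergekit's implementation, which additionally handles per-model weights, whole checkpoints, and the `normalize`/`int8_mask` options from the config.

```python
import torch

def ties_merge(base: torch.Tensor, finetuned: list[torch.Tensor],
               density: float = 0.5, weight: float = 0.5) -> torch.Tensor:
    """Simplified single-tensor TIES merge (uniform weight across models)."""
    deltas = []
    for ft in finetuned:
        delta = ft - base                          # task vector vs. the base model
        k = max(1, int(delta.numel() * density))   # keep top-`density` fraction
        threshold = delta.abs().flatten().kthvalue(delta.numel() - k + 1).values
        # Trim: zero out low-magnitude entries of the task vector.
        deltas.append(torch.where(delta.abs() >= threshold, delta,
                                  torch.zeros_like(delta)))
    stacked = torch.stack(deltas)
    elected = torch.sign(stacked.sum(dim=0))       # elect the dominant sign per element
    agree = torch.sign(stacked) == elected         # mask of sign-agreeing entries
    merged = torch.where(agree, stacked, torch.zeros_like(stacked)).sum(dim=0)
    counts = agree.sum(dim=0).clamp(min=1)         # disjoint mean over agreeing models
    return base + weight * merged / counts
```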
### Models Merged

The following models were included in the merge:

* [cutelemonlili/Qwen2.5-1.5B-Instruct_MATH_training_response_Qwen2.5_14B](https://huggingface.co/cutelemonlili/Qwen2.5-1.5B-Instruct_MATH_training_response_Qwen2.5_14B)
* [UWNSL/Qwen2.5-1.5B-Instruct_Short_CoT](https://huggingface.co/UWNSL/Qwen2.5-1.5B-Instruct_Short_CoT)
### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Xiaojian9992024/Qwen2.5-Ultra-1.5B-25.02-Exp
    # no parameters necessary for the base model
  - model: UWNSL/Qwen2.5-1.5B-Instruct_Short_CoT
    parameters:
      density: 0.5
      weight: 0.5
  - model: cutelemonlili/Qwen2.5-1.5B-Instruct_MATH_training_response_Qwen2.5_14B
    parameters:
      density: 0.5
      weight: 0.5
merge_method: ties
base_model: Xiaojian9992024/Qwen2.5-Ultra-1.5B-25.02-Exp
parameters:
  normalize: false
  int8_mask: true
dtype: float16
```
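To reproduce the merge, mergekit also exposes a Python API mirroring the `mergekit-yaml` CLI. The sketch below follows the interface shown in the mergekit README; the output path is a hypothetical placeholder, and the API may differ across versions, so check the README for your installed release.

```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML configuration shown above from a local file.
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    "./merged-model",                  # hypothetical output directory
    options=MergeOptions(cuda=False),  # set cuda=True to merge on GPU
)
```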