This series aims to unify the official models of Qwen.
The unified model is obtained by merging the code model and the instruction model with the SCE method.
Configuration
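The following YAML configurations were used to produce this model. The first two build the intermediate models with the della method, and the third merges them with sce: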
models:
  - model: Qwen/Qwen2.5-32B-instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-32B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-32B-YOYO

models:
  - model: Qwen/Qwen2.5-Coder-32B-instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-Coder-32B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-Coder-32B-YOYO
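
The two della configurations above each merge a single instruct model into its base with density: 1. As a rough illustration only, and assuming that density 1 means no delta parameters are dropped or rescaled and that sign election is trivial with one model, each stage should reduce to adding the instruct-minus-base delta scaled by lambda. The function name and the toy vectors below are hypothetical, and this is a sketch of that reading rather than mergekit's implementation.

```python
from typing import List


def dense_della_single_model(
    base: List[float], tuned: List[float], lam: float = 0.9
) -> List[float]:
    """Toy, hypothetical stand-in for one della stage with density 1 and one model.

    Assumptions (not taken from mergekit's source):
    - density = 1 keeps every delta, so no dropping or rescaling occurs;
    - with a single model, sign election keeps every delta, and
      normalize = true turns weight = 1 into a coefficient of 1;
    - lambda scales the merged delta before it is added to the base.
    """
    if len(base) != len(tuned):
        raise ValueError("base and tuned must have the same length")
    return [b + lam * (t - b) for b, t in zip(base, tuned)]


# Toy parameters standing in for the base and instruct weights.
base = [1.0, 2.0, -0.5]
instruct = [2.0, 0.0, -0.5]
merged = dense_della_single_model(base, instruct, lam=0.9)
print(merged)
```

Under these assumptions, lambda: 0.9 moves each parameter 90% of the way from the base model toward the instruct model, and parameters that fine-tuning left unchanged stay unchanged.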

merge_method: sce
models:
  # Pivot model
  - model: Qwen/Qwen2.5-Coder-32B
  # Target models
  - model: YOYO-AI/Qwen2.5-32B-YOYO
  - model: YOYO-AI/Qwen2.5-Coder-32B-YOYO
base_model: Qwen/Qwen2.5-Coder-32B
parameters:
  select_topk: 1
dtype: bfloat16
tokenizer_source: base
normalize: true
int8_mask: true
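
The sce step can be pictured on toy vectors. The sketch below assumes the usual reading of SCE as select, calculate, erase: keep the positions where the task vectors (target minus pivot) vary most, weight each target by the mean squared size of its task vector, then drop entries whose sign disagrees with the summed sign at that position. The function name `sce_merge_toy`, the flat lists, and the example numbers are hypothetical; real merges operate per tensor, and this is not mergekit's code.

```python
from typing import List


def sce_merge_toy(
    pivot: List[float], targets: List[List[float]], select_topk: float = 1.0
) -> List[float]:
    """Hypothetical toy version of an SCE-style merge on flat parameter lists."""
    if not targets or any(len(t) != len(pivot) for t in targets):
        raise ValueError("need at least one target, all the same length as pivot")
    n = len(pivot)

    # Task vectors: how far each target moved away from the pivot.
    tvs = [[t[i] - pivot[i] for i in range(n)] for t in targets]

    # Select: keep only the top fraction of positions by variance across targets.
    # With select_topk = 1 (as in the configuration) nothing is filtered out.
    if select_topk < 1.0:
        variances = []
        for i in range(n):
            col = [tv[i] for tv in tvs]
            mean = sum(col) / len(col)
            variances.append(sum((x - mean) ** 2 for x in col) / len(col))
        keep = max(1, int(select_topk * n))
        ranked = sorted(range(n), key=lambda i: variances[i], reverse=True)
        top = set(ranked[:keep])
        tvs = [[v if i in top else 0.0 for i, v in enumerate(tv)] for tv in tvs]

    # Calculate: one coefficient per target, proportional to its mean squared delta.
    energy = [sum(v * v for v in tv) / n for tv in tvs]
    total = sum(energy)
    if total < 1e-12:
        coeffs = [1.0 / len(tvs)] * len(tvs)
    else:
        coeffs = [e / total for e in energy]

    # Erase: at each position, keep only entries that agree with the summed sign,
    # then take the coefficient-weighted average of the survivors.
    merged = []
    for i in range(n):
        sign = 1.0 if sum(tv[i] for tv in tvs) >= 0 else -1.0
        num = 0.0
        den = 0.0
        for c, tv in zip(coeffs, tvs):
            if tv[i] * sign > 0:
                num += c * tv[i]
                den += c
        merged.append(pivot[i] + (num / den if den > 0 else 0.0))
    return merged


# Toy example: one pivot and two targets with two parameters each.
pivot = [1.0, 1.0]
target_a = [4.0, 2.0]  # task vector [3, 1]
target_b = [2.0, 0.0]  # task vector [1, -1]
result = sce_merge_toy(pivot, [target_a, target_b], select_topk=1.0)
print(result)
```

In this toy run the second parameter of `target_b` is erased because its sign loses to `target_a`, while the first parameter is a weighted average that leans toward `target_a`, whose task vector is larger.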