image/jpeg This series aims to unify the official models of Qwen.

The unified model obtained by merging the code model and the instruction model through the SCE method

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: Qwen/Qwen2.5-32B-instruct
    parameters:
      density: 1 
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-32B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-32B-YOYO
models:
  - model: Qwen/Qwen2.5-Coder-32B-instruct
    parameters:
      density: 1 
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-Coder-32B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-Coder-32B-YOYO
merge_method: sce
models:
  # Pivot model
  - model: Qwen/Qwen2.5-Coder-32B
  # Target models
  - model: YOYO-AI/Qwen2.5-32B-YOYO
  - model: YOYO-AI/Qwen2.5-Coder-32B-YOYO
base_model: Qwen/Qwen2.5-Coder-32B
parameters:
  select_topk: 1
dtype: bfloat16
tokenizer_source: base
normalize: true
int8_mask: true
Downloads last month
24
Safetensors
Model size
32.8B params
Tensor type
BF16
ยท
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for YOYO-AI/Qwen2.5-32B-YOYO-MIX