YOYO-AI/Qwen2.5-32B-YOYO-MIX

This series aims to unify the official models of Qwen.

The unified model is obtained by merging the code model and the instruction model with the SCE method.
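As a rough illustration of what an SCE (select, calculate, erase) merge does, here is a minimal pure-Python sketch on flat lists of floats. It is a simplification written for this card, not mergekit's implementation: the function name `sce_merge_toy` and the example weights are hypothetical, and real merges operate per tensor on the full checkpoints.

```python
def sce_merge_toy(base, targets, select_topk=1.0):
    """Toy SCE merge over flat lists of floats (illustration only)."""
    n = len(base)
    # Task vectors: each target's offset from the pivot (base) weights.
    tvs = [[t[i] - base[i] for i in range(n)] for t in targets]

    # Select: keep the fraction of positions whose values vary most across
    # task vectors. With select_topk = 1 every position is kept.
    if select_topk < 1.0:
        means = [sum(tv[i] for tv in tvs) / len(tvs) for i in range(n)]
        var = [sum((tv[i] - means[i]) ** 2 for tv in tvs) / len(tvs)
               for i in range(n)]
        k = int(sum(1 for v in var if v > 0) * select_topk)
        keep = set(sorted(range(n), key=lambda i: var[i], reverse=True)[:k])
        tvs = [[tv[i] if i in keep else 0.0 for i in range(n)] for tv in tvs]

    # Calculate: weight each model by its share of mean squared task-vector entries.
    energy = [sum(x * x for x in tv) / n for tv in tvs]
    total = sum(energy)
    if total > 1e-6:
        weights = [e / total for e in energy]
    else:
        weights = [1.0 / len(tvs)] * len(tvs)

    # Erase: at each position drop entries whose sign disagrees with the
    # sign of the summed task vectors, then take the weighted average.
    merged = []
    for i in range(n):
        majority = 1.0 if sum(tv[i] for tv in tvs) >= 0 else -1.0
        num = den = 0.0
        for w, tv in zip(weights, tvs):
            if tv[i] * majority > 0:
                num += w * tv[i]
                den += w
        merged.append(base[i] + (num / den if den > 0 else 0.0))
    return merged


# Hypothetical three-parameter "models" sharing a zero pivot.
pivot = [0.0, 0.0, 0.0]
model_a = [1.0, 2.0, -1.0]
model_b = [3.0, -2.0, -1.0]
merged = sce_merge_toy(pivot, [model_a, model_b])
```

In the second position the two models disagree in sign, so only one of them contributes to the merged value; in the other positions both contribute in proportion to their weights.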

Configuration

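The following YAML configurations were used to produce this model. The first two use DELLA to build the intermediate models Qwen2.5-32B-YOYO and Qwen2.5-Coder-32B-YOYO; the third merges them with SCE: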

models:
  - model: Qwen/Qwen2.5-32B-instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-32B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-32B-YOYO

models:
  - model: Qwen/Qwen2.5-Coder-32B-instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-Coder-32B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-Coder-32B-YOYO

merge_method: sce
models:
  # Pivot model
  - model: Qwen/Qwen2.5-Coder-32B
  # Target models
  - model: YOYO-AI/Qwen2.5-32B-YOYO
  - model: YOYO-AI/Qwen2.5-Coder-32B-YOYO
base_model: Qwen/Qwen2.5-Coder-32B
parameters:
  select_topk: 1
dtype: bfloat16
tokenizer_source: base
normalize: true
int8_mask: true
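For intuition about the two DELLA stages, the sketch below shows what such a stage amounts to when there is a single input model. It assumes that `density: 1` means no delta parameters are dropped and that `weight: 1` with `normalize: true` leaves the task vector unscaled, so only the `lambda: 0.9` factor remains. This is an assumption about the arithmetic, not a statement of mergekit's internals, and the helper name `della_single_model` and the example weights are hypothetical.

```python
def della_single_model(base, tuned, lam=0.9):
    """Toy version of one DELLA stage with a single input model.

    Assumes density=1 (no delta is dropped) and weight=1 with normalize=true,
    so only the lambda scaling of the task vector remains.
    """
    return [b + lam * (t - b) for b, t in zip(base, tuned)]


# Hypothetical weights standing in for the base and instruct checkpoints.
base = [1.0, 2.0]
instruct = [2.0, 0.0]
yoyo = della_single_model(base, instruct)
```

Under these assumptions each intermediate model sits 90% of the way from its base model to the corresponding instruct model.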
