TinyQwex-4x620M-MoE

TinyQwex-4x620M-MoE is a Mixture of Experts (MoE) made with LazyMergekit from four copies of Qwen/Qwen1.5-0.5B, one per expert, as listed in the configuration below.

🌟 Buying me coffee is a direct way to show support for this project.

💻 Usage

!pip install -qU transformers bitsandbytes accelerate einops

from transformers import AutoTokenizer
import transformers
import torch

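model = "Isotonic/TinyQwex-4x620M-MoE"

# Load the tokenizer of the base model that all four experts were built from.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")

# Build a text-generation pipeline with 4-bit weights (bitsandbytes) and bfloat16 compute.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,  # use the tokenizer loaded above instead of leaving it unused
    model_kwargs={"torch_dtype": torch.bfloat16, "load_in_4bit": True},
)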

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
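To make the apply_chat_template step less opaque, here is a minimal sketch of what a ChatML-style template does to the messages list. It assumes the tokenizer uses ChatML markers (<|im_start|> and <|im_end|>), as Qwen1.5 tokenizers do. The exact template shipped with this model may differ, for example by prepending a default system message. to_chatml is a hypothetical helper written for illustration and is not part of transformers.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render chat messages in ChatML-style markup (illustrative only,
    not necessarily the exact template this model's tokenizer uses)."""
    parts = []
    for msg in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
print(to_chatml(messages))
```

In practice, prefer pipeline.tokenizer.apply_chat_template as in the usage code, since it applies whatever template the tokenizer actually defines.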

🧩 Configuration

experts:
  - source_model: Qwen/Qwen1.5-0.5B
    positive_prompts:
    - "reasoning"

  - source_model: Qwen/Qwen1.5-0.5B
    positive_prompts:
    - "program"

  - source_model: Qwen/Qwen1.5-0.5B
    positive_prompts:
    - "storytelling"

  - source_model: Qwen/Qwen1.5-0.5B
    positive_prompts:
    - "Instruction following assistant"
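
Each expert's positive_prompts describe the kind of input that expert should be routed to. The following toy sketch shows how a router might pick experts for one token. It is not this model's actual router: the gate vectors, the 3-dimensional hidden state, and top_k=2 are all made-up assumptions, and the configuration above does not show how the real gates were initialized.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(hidden, gate_vectors, top_k=2):
    """Return [(expert_index, weight), ...] for the top_k experts."""
    # One logit per expert: dot product of the token's hidden state
    # with that expert's gate vector.
    logits = [sum(h * g for h, g in zip(hidden, gate)) for gate in gate_vectors]
    # Keep the top_k experts by logit, then normalize over just those.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    weights = softmax([logits[i] for i in top])
    return list(zip(top, weights))


# Expert labels taken from the configuration; gate vectors are hypothetical.
EXPERTS = ["reasoning", "program", "storytelling", "Instruction following assistant"]
GATES = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5],
]

token_hidden = [0.1, 2.0, 0.3]  # made-up hidden state leaning toward "program"
for idx, weight in route(token_hidden, GATES):
    print(f"{EXPERTS[idx]}: {weight:.2f}")
```

The token's output would then be the weighted sum of the selected experts' outputs, using the printed weights.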

Model size: 1.24B params (Safetensors), tensor type BF16