GPT-OSS MoE - Micro Variant

A Mixture of Experts (MoE) language model optimized for Mac Mini (16GB RAM).

Model Details

  • Variant: Micro
  • Active Parameters: ~40M
  • Total Experts: 4
  • Experts per Token: 1 (top-1 routing; see the sketch after this list)
  • Hidden Size: 384
  • Layers: 8
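
The configuration above corresponds to a switch-style feed-forward block with top-1 routing: each token's hidden state is sent to exactly one of the 4 experts. The sketch below illustrates the idea with the dimensions listed here; it is a generic reference implementation, not the layer definitions from the repository (the FFN expansion factor of 4 is an assumption).

import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoEFeedForward(nn.Module):
    """Minimal switch-style MoE feed-forward block (top-1 routing)."""
    def __init__(self, hidden_size=384, num_experts=4, ffn_mult=4):
        super().__init__()
        self.router = nn.Linear(hidden_size, num_experts)   # token -> expert logits
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_size, ffn_mult * hidden_size),
                nn.GELU(),
                nn.Linear(ffn_mult * hidden_size, hidden_size),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x):                     # x: (batch, seq, hidden)
        probs = F.softmax(self.router(x), dim=-1)
        top_p, top_idx = probs.max(dim=-1)    # pick one expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i               # tokens routed to expert i
            if mask.any():
                out[mask] = expert(x[mask]) * top_p[mask].unsqueeze(-1)
        return out

With 4 experts and 1 expert per token, only a quarter of the expert parameters are active for any given token, which is roughly how the active-parameter count stays near 40M while the total parameter count is larger.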

Usage

import torch
from transformers import AutoTokenizer

# Load model state dict
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Note: This model requires the custom GPT-OSS MoE architecture
# See: https://github.com/xin-slm/xinSLM_v06_gpt_oss
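
Until the custom architecture is installed from the repository above, the checkpoint can still be inspected directly. The snippet below is an illustrative sanity check that only needs PyTorch, assuming pytorch_model.bin is a plain state dict of tensors: it reports the number of tensors, the total parameter count (all experts included, so larger than the ~40M active parameters), and a few layer names and shapes.

import torch

state_dict = torch.load("pytorch_model.bin", map_location="cpu")

# Total parameters across all tensors (includes every expert).
total = sum(t.numel() for t in state_dict.values())
print(f"{len(state_dict)} tensors, {total:,} parameters")

# Peek at the first few entries to see the layer naming scheme.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape), tensor.dtype)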

Training Details

  • Platform: Mac Mini 16GB, Apple Silicon MPS
  • Memory Optimization: real-time memory monitoring with periodic cache cleanup during training (a generic sketch follows this list)
  • Architecture: GPT-OSS with reduced experts for memory efficiency
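
The actual monitoring code lives in the training scripts of the repository; the fragment below is only a minimal sketch of the general pattern on Apple Silicon, using psutil for process memory and torch.mps.empty_cache() to release cached MPS allocations. The threshold value and call site are assumptions.

import gc
import psutil
import torch

def cleanup_if_needed(threshold_gb=12.0):
    """Free cached memory when the training process grows too large.

    Generic sketch: the 12 GB threshold and per-step call frequency are
    assumptions, not values taken from the actual training script.
    """
    used_gb = psutil.Process().memory_info().rss / 1024**3
    if used_gb > threshold_gb:
        gc.collect()                          # drop unreachable Python objects
        if torch.backends.mps.is_available():
            torch.mps.empty_cache()           # release cached MPS allocations

# Example call site, e.g. once per training step:
# for step, batch in enumerate(loader):
#     ...
#     cleanup_if_needed()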

Citation

@misc{xinslm-gpt-oss-moe-micro,
    title={GPT-OSS MoE Micro: Memory-Optimized Mixture of Experts},
    author={Xinson Li},
    year={2025},
    url={https://huggingface.co/lixinso/xinslm-gpt-oss-moe-micro}
}