GPT OS3 Beta 8B A3B

  • Developed by: qingy2024
  • Base model: AmanPriyanshu/gpt-oss-8.4b-specialized-all-pruned-moe-only-11-experts

GPT OSS Small (OS3) is a project to create usable and intelligent language models based on the pruned GPT-OSS-20B variants by AmanPriyanshu. These are post-trained with LoRA on the qingy2024/GPT-OS3-Dataset-v1 dataset to undo some of the "brain damage" caused by expert pruning.

Beta checkpoint, step 4163, V1 dataset. Can generate mostly coherent output. Will be used as the FFT (full fine-tuning) or LoRA base for the v2 dataset. Chi didn't use the right chat template :/
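A minimal sketch of how this checkpoint might be loaded for a quick coherence check with Hugging Face Transformers. The card does not provide this code. It assumes a recent `transformers` release with GPT-OSS support, plus `torch` and `accelerate`. Because of the chat template issue noted above, the sketch sends a plain-text prompt; `build_prompt` and `generate` are hypothetical helpers and the prompt format is not an official one.

```python
MODEL_ID = "qingy2024/GPT-OS3-Beta-8B-A3B"  # repository name from this card


def build_prompt(question: str) -> str:
    """Plain-text prompt. Hypothetical format, not the model's chat template."""
    return f"Question: {question.strip()}\nAnswer:"


def generate(question: str, max_new_tokens: int = 128) -> str:
    """Load the checkpoint and return only the newly generated text."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: load in BF16, matching the stored weights
        device_map="auto",           # requires the `accelerate` package
    )

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Slice off the prompt tokens so only the continuation is decoded.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate("What is expert pruning?")` downloads the weights on first use, so expect a large download and enough memory for an 8.37B-parameter model in BF16.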

Built with Axolotl

  • Format: Safetensors
  • Model size: 8.37B params
  • Tensor type: BF16