Model Card for Acc Qwen 4B
Acc Qwen 4B is a state of the art accessibility GRPO RL trained model with RM_R1 style Chain of Rubric distsillation of Claude 4 Opus using Gemini 2.5 Flash to Qwen 3 4B over 18 million tokens.
The code for training the model is at https://github.com/Nottlespike/Accessible_Qwen
- Downloads last month
- 25
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support