Click here to support our open-source dataset and model releases!
This is an early alpha preview of the upcoming Esper 3 series for Qwen 3 - use at your own discretion! The full model is now available, click here!
Esper 3 is a reasoning-chat finetune focused on coding, architecture, DevOps, and general reasoning chat.
All training data generated synthetically by Deepseek-R1 685b model. This sneak preview uses training data from our Titanium, Tachibana, and Raiden series of datasets. Final datasets used will be provided along the full release of Esper 3 - this preview release is only trained on a subselection of the data for early testing. Full model release coming soon!
See the Qwen 3 8b page for sample prompting scripts or further information on the base model. Esper 3 is a reasoning finetune: enable_thinking=True is recommended for all chats.
Try the preview release out, see what you think, tell your friends :)
More Qwen 3 releases to come soon!
Do as you will.
- Downloads last month
- 33