sequelbox's picture
esper 3 is here :)
fbc9520 verified
metadata
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
base_model:
  - Qwen/Qwen3-8B

Click here to support our open-source dataset and model releases!

This is an early alpha preview of the upcoming Esper 3 series for Qwen 3 - use at your own discretion! The full model is now available, click here!

Esper 3 is a reasoning-chat finetune focused on coding, architecture, DevOps, and general reasoning chat.

All training data generated synthetically by Deepseek-R1 685b model. This sneak preview uses training data from our Titanium, Tachibana, and Raiden series of datasets. Final datasets used will be provided along the full release of Esper 3 - this preview release is only trained on a subselection of the data for early testing. Full model release coming soon!

See the Qwen 3 8b page for sample prompting scripts or further information on the base model. Esper 3 is a reasoning finetune: enable_thinking=True is recommended for all chats.

Try the preview release out, see what you think, tell your friends :)

Please consider supporting our releases if you can. There's still time for a bottom-up AI revolution: the time to make a difference in how this turns out is now!

More Qwen 3 releases to come soon!

Do as you will.