OctoThinker

community
https://github.com/GAIR-NLP/OctoThinker
GAIR-NLP


Recent Activity

koalazf99 authored a paper 5 days ago
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
koalazf99 published a model 2 months ago
OctoThinker/OctoThinker-8B-Hybrid-Base
koalazf99 published a model 2 months ago
OctoThinker/OctoThinker-8B-Short-Base

Team members: Fan Zhou, Zengzhi Wang, xuefengli
Organization Card

šŸ™ OctoThinker is led by GAIR

🎯 Our Goal: To reshape the pre-training trajectory so models scale better under RL.

models 16

OctoThinker/OctoThinker-8B-Hybrid-Base

Updated Apr 24 • 12 • 2

OctoThinker/OctoThinker-8B-Short-Base

Updated Apr 24 • 46

OctoThinker/OctoThinker-8B-Long-Base

Updated Apr 24 • 11

OctoThinker/OctoThinker-3B-Short-Zero

Updated Apr 23 • 8

OctoThinker/OctoThinker-3B-Hybrid-Zero

Updated Apr 23 • 32

OctoThinker/OctoThinker-1B-Long-Zero

Updated Apr 23 • 7

OctoThinker/OctoThinker-1B-Hybrid-Zero

Updated Apr 23 • 7

OctoThinker/OctoThinker-1B-Short-Zero

Updated Apr 23 • 41

OctoThinker/Llama3.2-3B-Zero

Updated Apr 22 • 11

OctoThinker/OctoThinker-3B-Long-Zero

Updated Apr 22 • 8

datasets 0

None public yet