
OctoThinker/OctoThinker-3B-Long-Base
Text Generation • 3B • 6.21k
What makes a base language model suitable for RL? Through controlled experiments, we identify the key factors, then leverage them to scale up mid-training.