OctoThinker's collections (4)

Mid-training Analysis Checkpoints (Llama-3.2-3B)
What makes a base language model suitable for RL? Through controlled experiments, we identify key factors, then leverage them to scale up mid-training.
OctoThinker-Llama-3B Family
What makes a base language model suitable for RL? Through controlled experiments, we identify key factors, then leverage them to scale up mid-training.
OctoThinker-Llama-1B Family
What makes a base language model suitable for RL? Through controlled experiments, we identify key factors, then leverage them to scale up mid-training.