Mid-training Analysis Checkpoints (Llama-3.2-3B)
Collection
What makes a base language model suitable for RL? Through controlled experiments, we identify the key factors and then leverage them to scale up mid-training.
10 items