SmolLM2-135M Intermediate Checkpoints

We are releasing intermediate checkpoints of SmolLM2 to enable further research on mechanistic interpretability and learning dynamics. This repo contains checkpoints up to step 240000, which corresponds to ~250B tokens.
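Each checkpoint is stored as a separate revision of this repo (see the loading example below). To list the available revisions programmatically, you can query the repo's refs with huggingface_hub; a minimal sketch (if the checkpoints are published as tags rather than branches, check refs.tags instead):

# pip install huggingface_hub
from huggingface_hub import list_repo_refs

# Checkpoint revisions show up as refs of the model repo
refs = list_repo_refs("HuggingFaceTB/SmolLM2-135M-intermediate-checkpoints")
for branch in refs.branches:
    print(branch.name)  # e.g. "main", "step-240000", ...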

How to Load a Checkpoint

# pip install transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
checkpoint = "HuggingFaceTB/SmolLM2-135M-intermediate-checkpoints"
revision = "step-240000" # replace by the revision you want
device = torch.device("cuda" if torch.cuda.is_available() else "mps" if hasattr(torch, 'mps') and torch.mps.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained(checkpoint, revision=revision)
model = AutoModelForCausalLM.from_pretrained(checkpoint, revision=revision).to(device)
inputs = tokenizer.encode("Gravity is", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
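Because all checkpoints share the same tokenizer, one simple way to study learning dynamics is to track the model's loss on a fixed piece of text across revisions. A minimal sketch, assuming the revisions you want to compare exist in the repo (only step-240000 is named in this card; the list below is a placeholder to extend):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-135M-intermediate-checkpoints"
tokenizer = AutoTokenizer.from_pretrained(checkpoint, revision="step-240000")
inputs = tokenizer("Gravity is a fundamental force of nature.", return_tensors="pt")

# Extend this list with other revisions that actually exist in the repo
for revision in ["step-240000"]:
    model = AutoModelForCausalLM.from_pretrained(checkpoint, revision=revision)
    with torch.no_grad():
        # With labels == input_ids, the model returns the mean next-token
        # cross-entropy loss over the sequence
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    print(f"{revision}: loss = {loss.item():.3f}")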

Training Details

For comprehensive information about the SmolLM2 training methodology, please refer to the SmolLM2 paper: https://arxiv.org/abs/2502.02737

Checkpoint Details

The global batch size (GBS) of SmolLM2-135M is 1,048,576 tokens per step. To get the number of tokens processed up to a given step:

nb_tokens = nb_step * GBS
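For example, the step-240000 checkpoint corresponds to roughly 250B tokens:

nb_step = 240_000
GBS = 1_048_576  # tokens per optimizer step
nb_tokens = nb_step * GBS
print(nb_tokens)  # 251,658,240,000 ≈ 251.7B tokens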

License

Apache 2.0

Citation

@misc{allal2025smollm2smolgoesbig,
      title={SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model}, 
      author={Loubna Ben Allal and Anton Lozhkov and Elie Bakouch and Gabriel Martín Blázquez and Guilherme Penedo and Lewis Tunstall and Andrés Marafioti and Hynek Kydlíček and Agustín Piqueres Lajarín and Vaibhav Srivastav and Joshua Lochner and Caleb Fahlgren and Xuan-Son Nguyen and Clémentine Fourrier and Ben Burtenshaw and Hugo Larcher and Haojun Zhao and Cyril Zakka and Mathieu Morlon and Colin Raffel and Leandro von Werra and Thomas Wolf},
      year={2025},
      eprint={2502.02737},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.02737}, 
}