Pierre Erbacher

erbacher

AI & ML interests

None yet

Recent Activity

published an article 6 days ago

Distribution Matching Prevents Mode Collapse in Training Reasoning Models

updated a model 4 months ago

erbacher/Qwen2.5-Kimina-1.7B-SFT

updated a dataset 5 months ago

erbacher/trl-NuminaMath-LEAN

Organizations

published an article 6 days ago

Article

Distribution Matching Prevents Mode Collapse in Training Reasoning Models

6 days ago

updated a model 4 months ago

erbacher/Qwen2.5-Kimina-1.7B-SFT

Text Generation • 2B • Updated Nov 10, 2025 • 1

updated a dataset 5 months ago

erbacher/trl-NuminaMath-LEAN

Viewer • Updated Nov 7, 2025 • 9.48k • 9

published a model 5 months ago

erbacher/Qwen2.5-Kimina-1.7B-SFT

Text Generation • 2B • Updated Nov 10, 2025 • 1

liked a Space 5 months ago

The Smol Training Playbook

📚

3.06k

The secrets to building world-class LLMs

published a dataset 6 months ago

erbacher/LeanRank-test

Viewer • Updated Sep 29, 2025 • 6.57k • 8

updated 3 datasets 6 months ago

published 3 datasets 7 months ago

erbacher/LeanRank-corpus

Viewer • Updated Sep 29, 2025 • 249k • 8

erbacher/LeanRank-data

Viewer • Updated Sep 29, 2025 • 2.09M • 55

erbacher/trl-NuminaMath-LEAN

Viewer • Updated Nov 7, 2025 • 9.48k • 9

upvoted a paper 9 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30, 2025 • 51

updated a dataset 9 months ago

erbacher/MATH_TTT

Viewer • Updated Jun 14, 2025 • 12k • 7

updated a model about 1 year ago

erbacher/wiki_categories

Updated Mar 7, 2025

published a model about 1 year ago

erbacher/wiki_categories

Updated Mar 7, 2025

updated a dataset about 1 year ago

erbacher/open-math-instruct-steps

Updated Mar 4, 2025 • 9

published a dataset about 1 year ago

erbacher/open-math-instruct-steps

Updated Mar 4, 2025 • 9

liked a Space about 1 year ago

The Ultra-Scale Playbook

🌌

3.75k

The ultimate guide to training LLM on large GPU Clusters

updated a model about 1 year ago

erbacher/Llama-3.2-Tulu-3-1B-SFT

1B • Updated Jan 2, 2025 • 1
