ProgramBench: Can Language Models Rebuild Programs From Scratch? Paper • 2605.03546 • Published 7 days ago • 2
Article Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents nvidia • 13 days ago • 48
LeWM Collection Official checkpoints and datasets related to LeWM paper. • 9 items • Updated Mar 27 • 36
Article Building a Fast Multilingual OCR Model with Synthetic Data nvidia • 24 days ago • 32
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration Paper • 2604.14116 • Published 27 days ago • 13
AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs Paper • 2507.05687 • Published Jul 8, 2025 • 31
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano and Super v3. • 28 items • Updated 3 days ago • 138
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 3 days ago • 147
Nemotron v3 Pre-Training Collection Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 3 days ago • 16
Unsloth Dynamic 2.0 Quants Collection New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 89 items • Updated 13 days ago • 599
Article Train AI models with Unsloth and Hugging Face Jobs for FREE burtenshaw, danielhanchen, shimmyshimmer, mlabonne, davanstrien, evalstate • Feb 20 • 100
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published Dec 18, 2025 • 222
R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents Paper • 2504.07164 • Published Apr 9, 2025 • 2
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular itazap, ariG23498, ArthurZ, sergiopaniego, merve, pcuenq • Dec 18, 2025 • 124
Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed Paper • 2512.14067 • Published Dec 16, 2025 • 16
Article Supercharge your OCR Pipelines with Open Models merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq • Oct 21, 2025 • 309