EXAONE-4.0 Collection A unified EXAONE model series of 1.2B and 32B models, integrating non-reasoning and reasoning modes. • 18 items • Updated 3 days ago • 32
💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for edge AI and on-device deployment. • 9 items • Updated 6 days ago • 72
ThinkPRM Collection Process Reward Models that Think -- https://arxiv.org/abs/2504.16828 • 8 items • Updated 29 days ago • 3
Article SmolLM3: smol, multilingual, long-context reasoner • By loubnabnl and 22 others • 10 days ago • 548
Energy-Based Transformers are Scalable Learners and Thinkers Paper • 2507.02092 • Published 15 days ago • 52
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy Paper • 2507.01352 • Published 16 days ago • 50
Reward Models Collection Nemotron reward models for use in RLHF pipelines and LLM-as-a-Judge. • 8 items • Updated 7 days ago • 16
Weaver Collection The models and datasets for Weaver: Shrinking the Generation-Verification Gap with Weak Verifiers • 21 items • Updated 23 days ago • 1
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs Paper • 2503.05139 • Published Mar 7 • 3
MiniCPM4 Collection MiniCPM4: Ultra-Efficient LLMs on End Devices • 22 items • Updated 26 days ago • 68
Common Pile v0.1 Filtered Data Collection An LLM pre-training dataset produced by filtering and deduplicating the raw text collected in the Common Pile v0.1. • 31 items • Updated Jun 6 • 14
Unsloth Dynamic 2.0 Quants Collection Version 2.0 of our Dynamic GGUF quants. Dynamic 2.0 improves accuracy and achieves state-of-the-art quantization performance. • 37 items • Updated 1 day ago • 145
One-RL-to-See-Them-All Collection https://github.com/MiniMax-AI/One-RL-to-See-Them-All • 5 items • Updated May 26 • 14
General-Reasoner: Advancing LLM Reasoning Across All Domains Paper • 2505.14652 • Published May 20 • 23