ReSearch Collection Trained models as described in the paper "ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning" • 5 items • Updated Mar 27 • 5
DistilBERT release Collection Original DistilBERT model; checkpoints obtained via teacher-student learning from the original BERT checkpoints. • 6 items • Updated Apr 17, 2024 • 21
The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks Paper • 2504.15521 • Published 6 days ago • 60
THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models Paper • 2504.13367 • Published 10 days ago • 24
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Paper • 2504.15271 • Published 7 days ago • 63
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published 10 days ago • 111
LettuceDetect: A Hallucination Detection Framework for RAG Applications Paper • 2502.17125 • Published Feb 24 • 11
Hallucination detection Collection Trained ModernBERT models (base and large) for detecting hallucinations in LLM responses. The models are trained as token classifiers. • 4 items • Updated Mar 5 • 16
MAI-DS-R1 Collection MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team. • 2 items • Updated 11 days ago • 9