SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond Paper • 2505.19641 • Published 6 days ago • 58
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published 4 days ago • 101
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published 12 days ago • 58
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning Paper • 2505.08617 • Published 19 days ago • 40
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond Paper • 2503.21614 • Published Mar 27 • 39
view article Article Finally, a Replacement for BERT: Introducing ModernBERT By bclavie and 14 others • Dec 19, 2024 • 641
LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid Paper • 2502.07563 • Published Feb 11 • 24
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published Jan 22 • 63
NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models Paper • 2410.11805 • Published Oct 15, 2024 • 14
UI Agent Collection a collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics • 369 items • Updated 2 days ago • 53
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling Paper • 2409.19291 • Published Sep 28, 2024 • 20
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs Paper • 2407.10058 • Published Jul 14, 2024 • 32
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training Paper • 2406.16554 • Published Jun 24, 2024 • 1
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies Paper • 2404.06395 • Published Apr 9, 2024 • 23
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 103
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch Paper • 2309.10706 • Published Sep 19, 2023 • 17