bosonai/higgs-audio-v2-generation-3B-base Text-to-Speech • 6B • Updated 3 days ago • 140k • 475
The Invisible Leash: Why RLVR May Not Escape Its Origin Paper • 2507.14843 • Published 12 days ago • 81
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? Paper • 2506.11928 • Published Jun 13 • 24
EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge Paper • 2505.23009 • Published May 29 • 18
Pitfalls of Rule- and Model-based Verifiers -- A Case Study on Mathematical Reasoning Paper • 2505.22203 • Published May 28 • 6
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing Paper • 2505.21600 • Published May 27 • 71
SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning Paper • 2504.09081 • Published Apr 12 • 17
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Paper • 2504.11536 • Published Apr 15 • 61
Heimdall: test-time scaling on the generative verification Paper • 2504.10337 • Published Apr 14 • 33
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations Paper • 2504.10481 • Published Apr 14 • 84
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents Paper • 2407.18901 • Published Jul 26, 2024 • 35