Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 132
view article Article Llasa Goes RL: Training LLaSA with GRPO for Improved Prosody and Expressiveness Nov 5, 2025 • 10
view article Article Building the Open Agent Ecosystem Together: Introducing OpenEnv +8 Oct 23, 2025 • 138
TTS Collection Collection of some of the TTS models i found cool • 6 items • Updated Oct 10, 2025 • 1
EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs Paper • 2509.09174 • Published Sep 11, 2025 • 61
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 7 items • Updated Jul 11, 2025 • 369
Orpheus Multilingual Research Release Collection Beta Release of multilingual models. • 12 items • Updated Apr 10, 2025 • 108
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published Mar 11, 2025 • 71
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference Paper • 2502.18137 • Published Feb 25, 2025 • 59
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published Dec 19, 2024 • 20
BhasaAnuvaad Collection A Speech Translation Dataset for 13 Indian Languages • 11 items • Updated Jan 16, 2025 • 25