MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm • Paper • 2506.05218 • Published Jun 5 • 2
One-RL-to-See-Them-All • Collection • https://github.com/MiniMax-AI/One-RL-to-See-Them-All • 5 items • Updated May 26 • 14
Voila • Collection • Voila: Voice-Language Foundation Models. https://voila.maitrix.org • 7 items • Updated May 6 • 23
nvidia/parakeet-tdt-0.6b-v2 • Automatic Speech Recognition • 0.6B • Updated Jun 26 • 713k • 1.25k