Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based Qwen2.5 • 3 items • Updated 10 days ago • 77
Open-RS Collection Model weights & datasets in the paper "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn’t" • 8 items • Updated 16 days ago • 11
JARVIS-VLA-v1 Collection Vision-Language-Action Models in Minecraft. • 4 items • Updated 14 days ago • 9
DeTikZify Collection Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ • 12 items • Updated 17 days ago • 21
💫StarVector Models Collection StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated 16 days ago • 90
Cosmos Transfer1 Collection Multimodal Conditional World Generation for World2World Transfer • 5 items • Updated 2 days ago • 14
EXAONE-Deep Collection EXAONE reasoning model series of 2.4B, 7.8B, and 32B, optimized for reasoning tasks including math and coding • 9 items • Updated 19 days ago • 85
Wan2.1 14B 480p I2V LoRAs Collection A collection of Remade's Wan2.1 14B 480p I2V LoRAs • 39 items • Updated 5 days ago • 95
Forgetting Transformer: Softmax Attention with a Forget Gate Paper • 2503.02130 • Published Mar 3 • 29
Light-R1 Collection Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated 23 days ago • 11
view article Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality Mar 4 • 72
C4AI Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated Mar 4 • 68
EgoLife Collection CVPR 2025 - EgoLife: Towards Egocentric Life Assistant. Homepage: https://egolife-ai.github.io/ • 10 items • Updated 30 days ago • 16