TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published 4 days ago • 70
Transformers Use Causal World Models in Maze-Solving Tasks Paper • 2412.11867 • Published Dec 16, 2024 • 1
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy Paper • 2503.19757 • Published 9 days ago • 47
MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving Paper • 2503.16905 • Published 13 days ago • 52
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published 14 days ago • 70
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published 10 days ago • 110
Llama Nemotron Collection Open, Production-ready Enterprise Models • 3 items • Updated 8 days ago • 25
Wan2.1 14B 480p I2V LoRAs Collection A collection of Remade's Wan2.1 14B 480p I2V LoRAs • 39 items • Updated 2 days ago • 90
EXAONE-Deep Collection EXAONE reasoning model series of 2.4B, 7.8B, and 32B, optimized for reasoning tasks including math and coding • 9 items • Updated 16 days ago • 84
💫StarVector Models Collection StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated 13 days ago • 88
UIGEN-T1.5 REASONING MODEL Collection UIGEN'S Next Iteration. UIGEN-T1.5 is a midway model between 1 and 2, reflecting our new data collection pipeline changes. • 5 items • Updated 10 days ago • 5
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Paper • 2503.16219 • Published 14 days ago • 46
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning Paper • 2503.15265 • Published 15 days ago • 44
Article NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets • 16 days ago • 31