Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing • Paper • 2504.21356 • Published Apr 30 • 1
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment • Paper • 2507.20984 • Published 11 days ago • 51
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence • Paper • 2507.21046 • Published 11 days ago • 75
SpatialLM: Training Large Language Models for Structured Indoor Modeling • Paper • 2506.07491 • Published Jun 9 • 42
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling • Paper • 2506.20512 • Published Jun 25 • 46
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling • Paper • 2507.05240 • Published Jul 7 • 45
Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models • Paper • 2507.07484 • Published 30 days ago • 17
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation • Paper • 2507.10524 • Published 25 days ago • 65
Replacing thinking with tool usage enables reasoning in small language models • Paper • 2507.05065 • Published Jul 7 • 15
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning • Paper • 2507.12508 • Published 23 days ago • 26
Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation • Paper • 2504.17207 • Published Apr 24 • 29
Step1X-Edit: A Practical Framework for General Image Editing • Paper • 2504.17761 • Published Apr 24 • 93
Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 11 items • Updated 18 days ago • 522
Llama 3.2 • Collection • This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 628