LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published Jan 10 • 66
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers Paper • 2501.02393 • Published Jan 4 • 8
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM Paper • 2501.01904 • Published Jan 3 • 34
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing Paper • 2412.14711 • Published Dec 19, 2024 • 16
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published Mar 24 • 118