GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning Paper • 2505.17022 • Published 2 days ago • 24
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction Paper • 2505.11254 • Published 8 days ago • 47
AdaptThink: Reasoning Models Can Learn When to Think Paper • 2505.13417 • Published 5 days ago • 70
Learning Adaptive Parallel Reasoning with Language Models Paper • 2504.15466 • Published Apr 21 • 42
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 159
SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration Paper • 2411.10958 • Published Nov 17, 2024 • 56
NVILA: Efficient Frontier Visual Language Models Paper • 2412.04468 • Published Dec 5, 2024 • 60
view article Article Unbelievable! Run 70B LLM Inference on a Single 4GB GPU with This NEW Technique By lyogavin • Nov 30, 2023 • 34
T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation Paper • 2407.14505 • Published Jul 19, 2024 • 27
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training Paper • 2410.19313 • Published Oct 25, 2024 • 19
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation Paper • 2410.13861 • Published Oct 17, 2024 • 57
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration Paper • 2410.02367 • Published Oct 3, 2024 • 50