Shakti-VLMs: Scalable Vision-Language Models for Enterprise AI Paper • 2502.17092 • Published 2 days ago • 2
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Paper • 2502.18364 • Published 1 day ago • 21
AAD-LLM: Neural Attention-Driven Auditory Scene Understanding Paper • 2502.16794 • Published 3 days ago • 4
Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models Paper • 2502.15499 • Published 5 days ago • 12
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 1 day ago • 34
Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective Paper • 2502.17262 • Published 2 days ago • 14
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference Paper • 2502.18137 • Published 1 day ago • 40
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published 1 day ago • 53
M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment Paper • 2502.15167 • Published 6 days ago • 1
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning Paper • 2502.15425 • Published 5 days ago • 7
Beyond Release: Access Considerations for Generative AI Systems Paper • 2502.16701 • Published 3 days ago • 9
Forecasting Open-Weight AI Model Growth on Hugging Face Paper • 2502.15987 • Published 5 days ago • 9
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam Paper • 2502.17055 • Published 3 days ago • 13
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation Paper • 2502.16707 • Published 3 days ago • 10
Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration Paper • 2502.17110 • Published 2 days ago • 10
RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers Paper • 2502.15894 • Published 5 days ago • 15
Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models Paper • 2502.16033 • Published 5 days ago • 15
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published 3 days ago • 20
Slamming: Training a Speech Language Model on One GPU in a Day Paper • 2502.15814 • Published 7 days ago • 49