Collections
Collections including paper arxiv:2503.01933
- On Domain-Specific Post-Training for Multimodal Large Language Models
  Paper • 2411.19930 • Published • 27
- START: Self-taught Reasoner with Tools
  Paper • 2503.04625 • Published • 76
- Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer
  Paper • 2503.02495 • Published • 7
- Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI Perspective
  Paper • 2503.01933 • Published • 10

- Prompt-to-Leaderboard
  Paper • 2502.14855 • Published • 7
- Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
  Paper • 2502.16894 • Published • 26
- Generating Skyline Datasets for Data Science Models
  Paper • 2502.11262 • Published • 7
- Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge
  Paper • 2502.12501 • Published • 6

- Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
  Paper • 2311.11077 • Published • 28
- Tensor Product Attention Is All You Need
  Paper • 2501.06425 • Published • 84
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 35
- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 63

- Depth Anything V2
  Paper • 2406.09414 • Published • 97
- An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
  Paper • 2406.09415 • Published • 51
- Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
  Paper • 2406.04338 • Published • 38
- SAM 2: Segment Anything in Images and Videos
  Paper • 2408.00714 • Published • 113

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 147
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 13
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 56
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 47