-
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
Paper • 2502.14856 • Published • 7 -
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens
Paper • 2502.18890 • Published • 23 -
DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting
Paper • 2503.00784 • Published • 9
Collections
Discover the best community collections!
Collections including paper arxiv:2502.18890
-
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU
Paper • 2502.08910 • Published • 143 -
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens
Paper • 2502.18890 • Published • 23 -
MPO: Boosting LLM Agents with Meta Plan Optimization
Paper • 2503.02682 • Published • 23
-
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Paper • 2412.11605 • Published • 18 -
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 93 -
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
Paper • 2412.17739 • Published • 41 -
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
Paper • 2412.15443 • Published • 9
-
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
Paper • 2411.06558 • Published • 34 -
SlimLM: An Efficient Small Language Model for On-Device Document Assistance
Paper • 2411.09944 • Published • 12 -
Look Every Frame All at Once: Video-Ma^2mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing
Paper • 2411.19460 • Published • 11 -
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
Paper • 2412.05237 • Published • 47
-
Differential Transformer
Paper • 2410.05258 • Published • 171 -
PaliGemma 2: A Family of Versatile VLMs for Transfer
Paper • 2412.03555 • Published • 129 -
VisionZip: Longer is Better but Not Necessary in Vision Language Models
Paper • 2412.04467 • Published • 107 -
o1-Coder: an o1 Replication for Coding
Paper • 2412.00154 • Published • 43