Collections

Collections including paper arxiv:2502.18890

Speculative Decoding

about 21 hours ago

FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling

Paper • 2502.14856 • Published 17 days ago • 7
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Paper • 2502.18890 • Published 12 days ago • 23
DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting

Paper • 2503.00784 • Published 8 days ago • 9

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 25 days ago • 143
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Paper • 2502.18890 • Published 12 days ago • 23
MPO: Boosting LLM Agents with Meta Plan Optimization

Paper • 2503.02682 • Published 6 days ago • 23

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Paper • 2412.11605 • Published Dec 16, 2024 • 18
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 93
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval

Paper • 2412.15443 • Published Dec 19, 2024 • 9

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

Paper • 2411.06558 • Published Nov 10, 2024 • 34
SlimLM: An Efficient Small Language Model for On-Device Document Assistance

Paper • 2411.09944 • Published Nov 15, 2024 • 12
Look Every Frame All at Once: Video-Ma^2mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing

Paper • 2411.19460 • Published Nov 29, 2024 • 11
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published Dec 6, 2024 • 47

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 171
PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 129
VisionZip: Longer is Better but Not Necessary in Vision Language Models

Paper • 2412.04467 • Published Dec 5, 2024 • 107
o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published Nov 29, 2024 • 43
