From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens Paper • 2502.18890 • Published 12 days ago • 23
CLIPPER: Compression enables long-context synthetic data generation Paper • 2502.14854 • Published 18 days ago • 7
Jamba 1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated 4 days ago • 85
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 233