Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30, 2024 • 78
jonathanjordan21/mos-mamba-18x130m-trainer-dgx-lora-sft-merged Text Generation • Updated Aug 23, 2024 • 16
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale Paper • 2409.08264 • Published Sep 12, 2024 • 49
Differentiable Solver Search for Fast Diffusion Sampling Paper • 2505.21114 • Published 28 days ago • 10