ReTool: Reinforcement Learning for Strategic Tool Use in LLMs • Paper • 2504.11536 • Published 9 days ago • 58
Model Optimizer • Collection • A collection of generative models quantized and optimized with TensorRT Model Optimizer. • 17 items • Updated 1 day ago • 16
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation • Paper • 2503.19693 • Published 30 days ago • 75
Multilingual LLM Evaluation • Collection • Multilingual Evaluation Benchmarks • 8 items • Updated Mar 3 • 25
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam • Paper • 2502.17055 • Published Feb 24 • 18
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution • Paper • 2502.18449 • Published Feb 25 • 74
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference • Paper • 2502.18137 • Published Feb 25 • 57
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention • Paper • 2502.11089 • Published Feb 16 • 155
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models • Paper • 2502.09604 • Published Feb 13 • 36
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • Paper • 2502.08910 • Published Feb 13 • 149
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model • Paper • 2502.02737 • Published Feb 4 • 226
Transformers Can Navigate Mazes With Multi-Step Prediction • Paper • 2412.05117 • Published Dec 6, 2024 • 5