Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression Paper • 2505.19433 • Published 13 days ago • 5
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated 20 days ago • 148
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 268
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Paper • 2502.06781 • Published Feb 10 • 61
Article Accelerate Large Model Training using PyTorch Fully Sharded Data Parallel By smangrul and 1 other • May 2, 2022 • 4
Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing Paper • 2502.04411 • Published Feb 6 • 4
Article Token Merging for fast LLM inference : Background and first trials with Mistral By samchain • Apr 30, 2024 • 4
Should We Really Edit Language Models? On the Evaluation of Edited Language Models Paper • 2410.18785 • Published Oct 24, 2024 • 7
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Paper • 2410.10819 • Published Oct 14, 2024 • 7
LPZero: Language Model Zero-cost Proxy Search from Zero Paper • 2410.04808 • Published Oct 7, 2024 • 2
PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs Paper • 2410.05265 • Published Oct 7, 2024 • 32
Article LLM Data Engineering 3——Data Collection Magic: Acquiring Top Training Data By JessyTsu1 • Jun 4, 2024 • 4
Article Welcome Gemma 2 - Google's new open LLM By philschmid and 5 others • Jun 27, 2024 • 129
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 227
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models Paper • 2406.02924 • Published Jun 5, 2024 • 2
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA By ybelkada and 4 others • May 24, 2023 • 152