Article Training Large Language Models with Interpreter Feedback using WebAssembly By axolotl-ai-co and 1 other • Apr 3 • 13
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning Paper • 2503.05592 • Published Mar 7 • 27
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 139
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published Sep 18, 2024 • 78
Article Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques By Isayoften • Aug 26, 2024 • 67
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 42
Article How NuminaMath Won the 1st AIMO Progress Prize By yfleureau and 7 others • Jul 11, 2024 • 120
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 94
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Apr 28 • 365
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Paper • 2402.17193 • Published Feb 27, 2024 • 26