Article Welcome Gemma 2 - Google's new open LLM By philschmid and 5 others • Jun 27, 2024 • 130
Article RegMix: Data Mixture as Regression for Language Model Pre-training By SivilTaram • Jul 11, 2024 • 15
Article Selective fine-tuning of Language Models with Spectrum By anakin87 • Sep 3, 2024 • 36
Can LLMs Generate High-Quality Test Cases for Algorithm Problems? TestCase-Eval: A Systematic Evaluation of Fault Coverage and Exposure Paper • 2506.12278 • Published 25 days ago • 17
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation By yuxiang630 and 8 others • Apr 29, 2024 • 78
UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function Paper • 2410.21438 • Published Oct 28, 2024 • 2
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published Mar 13 • 29
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 235
Code Evaluation Collection Collection of Papers on Code Evaluation (from code generation language models) • 45 items • Updated Oct 29, 2024 • 15
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated about 20 hours ago • 163
Article 🪆 Introduction to Matryoshka Embedding Models By tomaarsen and 2 others • Feb 23, 2024 • 144
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22, 2024 • 48