OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs Paper • 2504.04030 • Published 18 days ago
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding Paper • 2503.02951 • Published Mar 4
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22, 2024
Genetic Instruct: Scaling up Synthetic Generation of Coding Instructions for Large Language Models Paper • 2407.21077 • Published Jul 29, 2024
Scoring Verifiers: Evaluating Synthetic Verification in Code and Reasoning Paper • 2502.13820 • Published Feb 19