DyCodeEval - a CM Collection

updated Jun 27

DyCodeEval (ICML 2025) enables dynamic benchmarking for code LLMs. This collection features dynamic HumanEval and MBPP sets generated with Claude 3.5.

CM/Dynamic_HumanEvalZero

Viewer • Updated Jun 23 • 15.7k • 54
CM/Dynamic_MBPP_sanitized

Viewer • Updated Jun 23 • 15.8k • 26
Dynamic Benchmarking of Reasoning Capabilities in Code Large Language Models Under Data Contamination

Paper • 2503.04149 • Published Mar 6 • 6
