📊 CodeForces
Datasets with FULLY VERIFIABLE competitive programming problems, reasoning traces, and human-created solutions
Note: Over 10k problems scraped from the CodeForces platform (almost all the available problems). Includes:
- Text rendering of LaTeX equations (using Qwen-VL)
- Problem metadata (tags, difficulty, etc.)
- Editorials when available
- Special model-generated checkers to validate problems with multiple correct answers
- Model-generated additional test cases for problems where not all test cases are public
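To illustrate why a checker is needed when a problem admits multiple correct answers, here is a minimal sketch in Python. The toy task, the input format, and the function name `check_pair_sum` are all hypothetical and are not taken from the dataset; the dataset's actual checkers may use a different language and interface. The point is only that a checker validates the properties of an answer instead of comparing it to one reference output.

```python
def check_pair_sum(input_text: str, contestant_output: str) -> bool:
    """Accept ANY valid answer for a toy task (hypothetical, for illustration).

    Assumed input format: first line "n target", second line n integers.
    Assumed output format: two distinct 1-based indices i j such that
    a[i] + a[j] == target.
    """
    lines = input_text.strip().splitlines()
    n, target = map(int, lines[0].split())
    values = list(map(int, lines[1].split()))

    tokens = contestant_output.split()
    if len(tokens) != 2:
        return False  # wrong number of tokens
    try:
        i, j = (int(tok) for tok in tokens)
    except ValueError:
        return False  # non-integer output

    # Indices must be distinct and inside the array bounds.
    if i == j or not (1 <= i <= n and 1 <= j <= n):
        return False

    # Validate the property itself rather than matching a reference answer.
    return values[i - 1] + values[j - 1] == target


sample_input = "4 6\n1 5 3 3\n"
print(check_pair_sum(sample_input, "1 2"))  # one valid answer
print(check_pair_sum(sample_input, "3 4"))  # a different, equally valid answer
print(check_pair_sum(sample_input, "1 3"))  # invalid: 1 + 3 != 6
```

An exact-match comparison against a single reference output would wrongly reject one of the two valid answers above, which is the situation the checkers are meant to handle.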
open-r1/codeforces-cots
Note: CodeForces-CoTs is a large-scale dataset for training reasoning models on competitive programming tasks. It consists of 10k CodeForces problems with up to five reasoning traces generated by DeepSeek R1.
open-r1/codeforces-submissions
Note: Dataset containing over 12 million real human submissions to the CodeForces platform.