Article Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models By AI-MO and 17 others • Jul 10 • 48
deepseek-ai/DeepSeek-Prover-V2-671B Text Generation • 685B • Updated Apr 30 • 1.83k • 809
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published Feb 20 • 193
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_gp_8b-table-0.002 Text Generation • 8B • Updated Sep 29, 2024 • 3
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_bt_8b-table-0.002 Text Generation • 8B • Updated Sep 28, 2024 • 3
xukp20/Llama-3-8B-Instruct-SPPO-Iter3_bt_8b-table Text Generation • 8B • Updated Sep 28, 2024 • 9
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_bt_2b-table-0.001 Text Generation • 8B • Updated Sep 28, 2024 • 4
xukp20/Llama-3-8B-Instruct-SPPO-Iter3_bt_2b-table Text Generation • 8B • Updated Sep 28, 2024 • 6
xukp20/Llama-3-8B-Instruct-SPPO-Iter3_gp_8b-table Text Generation • 8B • Updated Sep 28, 2024 • 2
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_gp_2b-table-0.001 Text Generation • 8B • Updated Sep 28, 2024 • 4
xukp20/Llama-3-8B-Instruct-SPPO-Iter3_gp_2b-table Text Generation • 8B • Updated Sep 28, 2024 • 3