open-r1-eval-leaderboard / eval_results

Commit History

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.02-soft-format-0.5-step-000000136/math_500/results_2025-04-08T12-24-36.598879.json with huggingface_hub
9e596ec
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.02-soft-format-0.5-step-000000068/math_500/results_2025-04-08T11-58-04.111324.json with huggingface_hub
d1194da
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.02-soft-format-0.25-step-000000068/math_500/results_2025-04-08T11-57-54.624245.json with huggingface_hub
468fdbb
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/main/aime24/results_2025-04-08T09-47-47.072526.json with huggingface_hub
9a0988c
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Llama-70B/main/aime24/results_2025-04-08T08-54-46.993714.json with huggingface_hub
447fd5e
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B/main/aime24/results_2025-04-08T08-47-20.321653.json with huggingface_hub
cf16c43
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B/main/aime24/results_2025-04-08T08-34-32.068250.json with huggingface_hub
1014a1b
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Llama-8B/main/aime24/results_2025-04-08T08-14-49.895932.json with huggingface_hub
e63a6a7
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/main/aime24/results_2025-04-08T08-10-57.713537.json with huggingface_hub
dba40bb
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.02-soft-format-step-000000340/math_500/results_2025-04-08T05-36-38.673745.json with huggingface_hub
b6f8815
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.02-soft-format-step-000000272/math_500/results_2025-04-08T05-26-27.791202.json with huggingface_hub
601cf1e
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.11-step-000000675/math_500/results_2025-04-08T05-16-12.144302.json with huggingface_hub
5997ca4
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.02-soft-format-step-000000204/math_500/results_2025-04-08T05-06-06.959169.json with huggingface_hub
f86c14f
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.11-step-000000540/math_500/results_2025-04-08T04-55-58.006880.json with huggingface_hub
974ed45
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.02-soft-format-step-000000136/math_500/results_2025-04-08T04-45-34.303242.json with huggingface_hub
cac6d1f
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.11-step-000000405/math_500/results_2025-04-07T17-07-04.682777.json with huggingface_hub
8e83073
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Llama-70B/main/aime25/results_2025-04-07T16-06-33.797969.json with huggingface_hub
b733fc6
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.02-soft-format-step-000000068/math_500/results_2025-04-07T15-58-50.252869.json with huggingface_hub
aa699c0
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Llama-70B/main/aime24/results_2025-04-07T15-51-42.681364.json with huggingface_hub
c9f9c1e
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/main/aime25/results_2025-04-07T15-35-31.012311.json with huggingface_hub
61841c1
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/main/aime24/results_2025-04-07T15-12-36.081710.json with huggingface_hub
ba6c7b1
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.11-step-000000270/math_500/results_2025-04-07T15-01-33.462657.json with huggingface_hub
8094f2a
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Llama-8B/main/aime24/results_2025-04-07T14-58-41.452716.json with huggingface_hub
2d98bea
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Llama-8B/main/aime25/results_2025-04-07T14-57-37.803136.json with huggingface_hub
996475f
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B/main/aime25/results_2025-04-07T14-43-35.438418.json with huggingface_hub
70284e3
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B/main/aime24/results_2025-04-07T14-41-15.505675.json with huggingface_hub
2356d83
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B/main/aime25/results_2025-04-07T14-25-45.053213.json with huggingface_hub
45d9fe8
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B/main/aime24/results_2025-04-07T14-24-09.428398.json with huggingface_hub
2cc36a0
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/main/aime24/results_2025-04-07T14-19-19.172638.json with huggingface_hub
5a71aba
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/main/aime25/results_2025-04-07T14-18-32.647417.json with huggingface_hub
8c0c742
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B/main/gpqa/results_2025-04-07T14-12-36.470741.json with huggingface_hub
1168bc1
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B/main/math_500/results_2025-04-07T14-12-31.175128.json with huggingface_hub
7fc6994
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/main/math_500/results_2025-04-07T14-07-21.597623.json with huggingface_hub
62a50dd
verified

lewtun HF Staff committed on

Upload eval_results/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/main/gpqa/results_2025-04-07T14-04-14.006882.json with huggingface_hub
619fbe8
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.11-step-000000135/math_500/results_2025-04-07T12-07-35.395049.json with huggingface_hub
c5aea4e
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-mask-True-step-000001620/math_500/results_2025-04-07T03-57-27.431513.json with huggingface_hub
3ab4b23
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-mask-False-step-000001350/math_500/results_2025-04-07T01-50-39.134238.json with huggingface_hub
33192a7
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.10-step-000000270/math_500/results_2025-04-07T01-39-20.602528.json with huggingface_hub
4060325
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-mask-True-step-000001350/math_500/results_2025-04-06T15-43-36.524143.json with huggingface_hub
3efb8fe
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-mask-True-step-000001080/math_500/results_2025-04-06T15-31-18.666387.json with huggingface_hub
08b5f83
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-norm-max-tokens-True-lr-2.0e-6-step-000000270/math_500/results_2025-04-06T15-30-47.002216.json with huggingface_hub
d357014
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.09-step-000000204/math_500/results_2025-04-06T15-30-25.769652.json with huggingface_hub
faa0bc6
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.09-step-000000136/math_500/results_2025-04-06T15-28-16.200352.json with huggingface_hub
4ac8cec
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.09-step-000000068/math_500/results_2025-04-06T13-14-07.631785.json with huggingface_hub
4900fa3
verified

lewtun HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-norm-max-tokens-True-lr-1.0e-6-step-000000810/math_500/results_2025-04-06T13-02-49.421472.json with huggingface_hub
5dfc3a3
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-disable_dropout-True-step-000000540/math_500/results_2025-04-06T13-01-48.992692.json with huggingface_hub
75c3c65
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-disable_dropout-False-step-000000540/math_500/results_2025-04-06T12-57-31.345559.json with huggingface_hub
6aff1b5
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-mask-True-step-000000540/math_500/results_2025-04-06T12-56-09.624887.json with huggingface_hub
48be99a
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.05-mask-True-step-000000270/math_500/results_2025-04-05T19-53-18.381738.json with huggingface_hub
502c7c1
verified

edbeeching HF Staff committed on

Upload eval_results/open-r1/R1-Zero-Qwen-7B-Math/v00.06-step-000000674/math_500/results_2025-04-04T13-06-55.834687.json with huggingface_hub
160f7c1
verified

lewtun HF Staff committed on