0-hero's Collections
R1-GRPO-Math-Python-Code-Experiments
Updated May 11, 2025
LoRA and full fine-tuning experiments on R1 distills to generate Python code for math problems
0-hero/r1-7B-grpo-v3.3-epoch-3 • 8B • Updated Mar 28, 2025 • 5
0-hero/r1-7B-grpo-v3.3-epoch-2 • 8B • Updated Mar 28, 2025 • 6
0-hero/r1-7B-grpo-v3.3-epoch-1 • 8B • Updated Mar 28, 2025 • 8
0-hero/r1-7B-grpo-v3.2-epoch-1 • 8B • Updated Mar 27, 2025 • 6
0-hero/r1-7B-grpo-v3.2-epoch-2 • 8B • Updated Mar 27, 2025 • 9
0-hero/r1-14B-grpo-v3.1-epoch-2 • 15B • Updated Mar 26, 2025 • 5
0-hero/r1-14B-grpo-v3.1-epoch-1 • 15B • Updated Mar 26, 2025 • 6
0-hero/r1-7B-grpo-v3.1-epoch-3 • 8B • Updated Mar 24, 2025 • 10
0-hero/r1-7B-grpo-v3.1-epoch-2 • 8B • Updated Mar 24, 2025 • 7
0-hero/r1-7B-grpo-v2-temp-1.0-60 • 8B • Updated Mar 23, 2025 • 7
0-hero/r1-14B-math-grpo-165 • 15B • Updated Mar 12, 2025 • 8
0-hero/r1-14B-math-grpo-80 • 15B • Updated Mar 11, 2025 • 8
0-hero/r1-7B-grpo-850 • 8B • Updated Mar 10, 2025 • 10
0-hero/r1-7B-grpo-710 • 8B • Updated Mar 10, 2025 • 7
0-hero/r1-7B-grpo-610 • 8B • Updated Mar 10, 2025 • 9
0-hero/r1-7B-grpo-80 • 8B • Updated Mar 10, 2025 • 7
0-hero/R1-7B-MATH-GRPO-FULL • 8B • Updated Mar 9, 2025 • 5
0-hero/R1-14B-GRPO • 15B • Updated Mar 8, 2025 • 10
0-hero/r1-7b-grpo-full • 8B • Updated Mar 6, 2025 • 7
0-hero/r1-8b-grpo-full • Updated Mar 6, 2025