Siheng99's Collections
🌸RePO
Updated Jun 6
RePO: Replay-Enhanced Policy Optimization
Siheng99/Qwen2.5-Math-1.5B-DeepMath-1024samples-GRPO • Text Generation • 2B • Updated Jun 6 • 3
Siheng99/Qwen2.5-Math-1.5B-DeepMath-1024samples-RePO • Text Generation • 2B • Updated Jun 6 • 3
Siheng99/Qwen2.5-Math-7B-DeepMath-1024samples-GRPO • Text Generation • 8B • Updated Jun 6 • 3
Siheng99/Qwen2.5-Math-7B-DeepMath-1024samples-RePO • Text Generation • 8B • Updated Jun 6 • 5
Siheng99/Qwen3-1.7B-DeepMath-1024samples-GRPO • Text Generation • 2B • Updated Jun 6 • 3
Siheng99/Qwen3-1.7B-DeepMath-1024samples-RePO • Text Generation • 2B • Updated Jun 6 • 3