The collection for the paper "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild"
HKUST NLP Group
university
Collections
7
models
45
hkust-nlp/Llama-3.1-8B-SimpleRL-Zoo • 25
hkust-nlp/Qwen-2.5-32B-SimpleRL-Zoo • 141
hkust-nlp/Qwen-2.5-7B-SimpleRL-Zoo • 385
hkust-nlp/DeepSeek-Math-7B-SimpleRL-Zoo • 62
hkust-nlp/Mistral-7B-v0.1-SimpleRL-Zoo • 15
hkust-nlp/Qwen-2.5-1.5B-SimpleRL-Zoo • 705
hkust-nlp/Qwen-2.5-0.5B-SimpleRL-Zoo • 28
hkust-nlp/Qwen-2.5-14B-SimpleRL-Zoo • 57
hkust-nlp/Mistral-Small-24B-SimpleRL-Zoo • 41
hkust-nlp/Qwen-2.5-Math-7B-SimpleRL-Zoo • 3.34k
datasets
22
hkust-nlp/SimpleRL-Zoo-Data • Viewer • 53.1k • 682 • 3
hkust-nlp/PreSelect-100B • Viewer • 54.5M • 1.11k • 9
hkust-nlp/CodeIO-PyEdu-Reasoning • Preview • 224 • 49
hkust-nlp/CodeIO-PyEdu-Reasoning-Raw • 122
hkust-nlp/SynCSE-partial-NLI • Viewer • 263k • 44 • 2
hkust-nlp/SynCSE-scratch-NLI • Viewer • 276k • 52 • 2
hkust-nlp/gsm8k-fix • Viewer • 7.47k • 118 • 2
hkust-nlp/dart-math-uniform • Viewer • 591k • 78 • 9
hkust-nlp/vrt-baseline • Viewer • 591k • 42 • 1
hkust-nlp/dart-math-hard • Viewer • 585k • 125 • 14