lalalaDa/ERPER-GRPO-alpha99
Text Generation
Transformers
Safetensors
knoveleng/open-rs
qwen2
Generated from Trainer
ERGRPO
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main
Commit History
End of training (99c7ea5, verified): lalalaDa committed on Jul 9
Model save (31f5da3, verified): lalalaDa committed on Jul 9
Training in progress, step 500 (799b74a, verified): lalalaDa committed on Jul 9
Training in progress, step 450 (1401e3f, verified): lalalaDa committed on Jul 9
Training in progress, step 400 (e5fa7bc, verified): lalalaDa committed on Jul 8
Training in progress, step 350 (199b3a4, verified): lalalaDa committed on Jul 8
Training in progress, step 300 (da27146, verified): lalalaDa committed on Jul 8
Training in progress, step 250 (6c3de9d, verified): lalalaDa committed on Jul 8
Training in progress, step 200 (3aa01b3, verified): lalalaDa committed on Jul 8
Training in progress, step 150 (73fad5f, verified): lalalaDa committed on Jul 8
Training in progress, step 100 (8f888a2, verified): lalalaDa committed on Jul 8
Training in progress, step 50 (ba69f9f, verified): lalalaDa committed on Jul 8
initial commit (2a4ff3d, verified): lalalaDa committed on Jul 8