lalalaDa/ER-GRPO-alpha99
Text Generation
Transformers
Safetensors
knoveleng/open-rs
qwen2
Generated from Trainer
ERGRPO
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main
Commit History
End of training (6a3dc66, verified), lalalaDa committed on Jul 7
Model save (014bb1e, verified), lalalaDa committed on Jul 7
Training in progress, step 50 (c2b5586, verified), lalalaDa committed on Jul 7
initial commit (dc740cd, verified), lalalaDa committed on Jul 6