phd2023/Qwen-2.5-7B-Simple-RL
Tags: Text Generation, Transformers, Safetensors, qwen2, Generated from Trainer, trl, grpo, conversational, text-generation-inference, arxiv:2402.03300
Commit History (branch: main)
Model save | d839f1c | verified | phd2023 committed on Mar 25
Model save | c776cd3 | verified | phd2023 committed on Mar 22
Model save | 4f1ae5a | verified | phd2023 committed on Mar 20
Model save | a677e34 | verified | phd2023 committed on Mar 20
initial commit | 245dc51 | verified | phd2023 committed on Mar 6