trl-lib/Qwen2-0.5B-Reward-Math-Sheperd