Sunshine279/gammaPO-llama-3-8b-instruct
Safetensors
princeton-nlp/llama3-ultrafeedback-armorm
llama
alignment-handbook
Generated from Trainer
arxiv: 2506.03690
License: mit
main: gammaPO-llama-3-8b-instruct/README.md
Commit History
Update README.md
bfab24b
verified
Sunshine279
committed on
28 days ago
Update README.md
a52f733
verified
Sunshine279
committed on
28 days ago
Update README.md
680e585
verified
Sunshine279
committed on
29 days ago
Upload README.md
b3642b2
verified
Sunshine279
committed on
29 days ago