Sunshine279/gammaPO-llama-3-8b-instruct



Sunshine279 committed 29 days ago

Commit 680e585 (verified) · 1 parent: 9cdcf13

Update README.md

Files changed (1)

README.md +1 -0


@@ -3,6 +3,7 @@ base_model: Sunshine279/gammaPO-llama-3-8b-instruct
 tags:
 - alignment-handbook
 - generated_from_trainer
+- arxiv:2506.03690
 datasets:
 - princeton-nlp/llama3-ultrafeedback-armorm
 model-index: