Sunshine279/gammaPO-gemma-2-9b-it

alignment-handbook

Generated from Trainer


Sunshine279 committed about 1 month ago

Commit e958761 · verified · 1 Parent(s): e8556e1

Update README.md

Files changed (1)

README.md +2 -2

README.md CHANGED

@@ -6,7 +6,7 @@ tags:
 datasets:
 - princeton-nlp/gemma2-ultrafeedback-armorm
 model-index:
-- name: gemma-2-9b-it-gmsimpo-beta10-gm0.5-tau20-lr8e-7-1220
+- name: gemma-2-9b-it-gmsimpo-beta10-gm0.5-tau20-lr8e-7
   results: []
 ---
@@ -14,7 +14,7 @@ model-index:
 should probably proofread and complete it, then remove this comment. -->
 [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](None)
-# gemma-2-9b-it-gmsimpo-beta10-gm0.5-tau20-lr8e-7-1220
+# gemma-2-9b-it-gmsimpo-beta10-gm0.5-tau20-lr8e-7
 This model is a fine-tuned version of [Sunshine279/gammaPO-gemma-2-9b-it](https://huggingface.co/Sunshine279/gammaPO-gemma-2-9b-it) on the princeton-nlp/gemma2-ultrafeedback-armorm dataset.
 It achieves the following results on the evaluation set: