qgallouedec HF staff committed on
Commit 623e295 · verified · 1 Parent(s): 4981722

End of training

Files changed (3)
  1. README.md +1 -1
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -28,7 +28,7 @@ print(output["generated_text"])
 
 ## Training procedure
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/huggingface/trl/runs/a8jlsgpf)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/huggingface/trl/runs/8g0pylqi)
 
 This model was trained with DPO, a method introduced in [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://huggingface.co/papers/2305.18290).
 
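The README context above says the model was trained with DPO. As a rough illustration of the objective from the linked paper, here is a minimal sketch of the per-example DPO loss in plain Python. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example; they are not taken from this commit or its training arguments.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_logratio - rejected_logratio))).

    Inputs are summed log-probabilities of the chosen and rejected
    completions under the trained policy and the frozen reference model.
    beta is an illustrative value (assumption), not read from this repo.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) == log(1 + exp(-x)), written in a numerically stable form
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# When the policy equals the reference, the loss is log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# When the policy favours the chosen completion more than the reference does,
# the loss drops below that baseline.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
```

The loss depends only on how much more the policy prefers the chosen completion than the reference does, which is why no separate reward model appears in the computation.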
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:003da1c30dc253a18402911f38deeba08e9d9b38f1824ec8027f20f8ce7a5db3
+oid sha256:68d8c6f8b82e83b09dfe41a5d91a34125a9d7c4e9f4aa1a56c050d70b9b0e562
 size 1976163472
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:79df2cd828b1674e03fbff845d93115a2a3f2a98a86d40fc4a3a5383de1f4bb2
+oid sha256:40a26e16277020fec8e3d2a117a6b10b4cdbaad948f42ec443a03a7c477b537e
 size 5944
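
Both binary files in this commit are stored as Git LFS pointers, so the diff shows only a changed `oid` while `size` stays the same. The sketch below shows one way such a pointer could be parsed and checked against downloaded bytes. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and written for this example; they are not part of Git LFS or any Hugging Face library.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,            # "sha256" in the pointers shown above
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# The two training_args.bin pointers from this commit: same size, different oid.
old = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:79df2cd828b1674e03fbff845d93115a2a3f2a98a86d40fc4a3a5383de1f4bb2
size 5944""")
new = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:40a26e16277020fec8e3d2a117a6b10b4cdbaad948f42ec443a03a7c477b537e
size 5944""")
content_changed = old["digest"] != new["digest"]
```

An unchanged `size` with a new `oid` is what one would expect when a file of the same shape is rewritten with different contents, for example retrained weights with an identical architecture.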