Aleksey Korshuk committed on Jan 23, 2024

Commit dc051b8 (unverified) · 1 parent: 59a31fe

Update rlhf.md (#1178) [skip ci]

Files changed (1)

docs/rlhf.md CHANGED

@@ -19,14 +19,14 @@ The various RL training methods are implemented in trl and wrapped via axolotl.
 #### DPO
 ```yaml
-rl: true
+rl: dpo
 datasets:
   - path: Intel/orca_dpo_pairs
     split: train
-    type: intel_apply_chatml
+    type: chatml.intel
   - path: argilla/ultrafeedback-binarized-preferences
     split: train
-    type: argilla_apply_chatml
+    type: chatml.argilla
 ```
 #### IPO
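
For anyone updating an existing config by hand, the renames in this hunk can be sketched as a small Python helper. This is only an illustration and not part of axolotl: the function name `migrate_rl_config` and the `TYPE_RENAMES` table are hypothetical, the config is assumed to be already loaded into a plain dict, and mapping `rl: true` to `dpo` assumes the old config was meant for DPO, as in the documented example.

```python
from copy import deepcopy

# Dataset type renames taken from the diff; any other value is left untouched.
TYPE_RENAMES = {
    "intel_apply_chatml": "chatml.intel",
    "argilla_apply_chatml": "chatml.argilla",
}


def migrate_rl_config(cfg: dict) -> dict:
    """Return a copy of cfg that uses the post-change values (hypothetical helper)."""
    new_cfg = deepcopy(cfg)

    # Assumption: an old boolean `rl: true` meant DPO, as in the DPO example.
    if new_cfg.get("rl") is True:
        new_cfg["rl"] = "dpo"

    for dataset in new_cfg.get("datasets", []):
        old_type = dataset.get("type")
        # Only plain string types listed in the table are renamed.
        if isinstance(old_type, str) and old_type in TYPE_RENAMES:
            dataset["type"] = TYPE_RENAMES[old_type]

    return new_cfg


old_cfg = {
    "rl": True,
    "datasets": [
        {"path": "Intel/orca_dpo_pairs", "split": "train",
         "type": "intel_apply_chatml"},
        {"path": "argilla/ultrafeedback-binarized-preferences", "split": "train",
         "type": "argilla_apply_chatml"},
    ],
}
new_cfg = migrate_rl_config(old_cfg)
```

The helper works on a copy, so the original dict stays available for comparison, and unknown `type` values pass through unchanged rather than being guessed at.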