ll922 committed on
Commit 85d95d6 · verified
1 Parent(s): 49a77d0

Update README.md

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -4,4 +4,8 @@ datasets:
  - PKU-Alignment/align-anything
  base_model:
  - Qwen/Qwen2.5-0.5B-Instruct
- ---
+ ---
+
+ DPO training is performed using the [Align-Anything](https://github.com/PKU-Alignment/align-anything) framework, with the *PKU-Alignment/align-anything* text-to-text dataset.
+
+ DPO training report: https://api.wandb.ai/links/nlp-amct/uifw66p5