Update README.md
README.md
CHANGED
@@ -4,4 +4,8 @@ datasets:
 - PKU-Alignment/align-anything
 base_model:
 - Qwen/Qwen2.5-0.5B-Instruct
----
+---
+
+DPO training is performed using the [Align-Anything](https://github.com/PKU-Alignment/align-anything) framework, with the *PKU-Alignment/align-anything* text-to-text dataset.
+
+DPO training report: https://api.wandb.ai/links/nlp-amct/uifw66p5