Update README.md
README.md
@@ -97,9 +97,9 @@ TODO
 Note we use a length-normalized variant of DPO for training.
 
 DPO:
-- **Learning Rate**: 8E-7 (7B
+- **Learning Rate**: 8E-7 (7B, 13B)
 - **Beta**: 5
-- **Effective Batch Size:** 128 (7B
+- **Effective Batch Size:** 128 (7B, 13B)
 - **Max. Sequence Length:** 2048
 - **Learning Rate Schedule:** Linear
 - **LR Warmup Ratio:** 0.1
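
The length-normalized DPO variant mentioned in the diff forms the preference logits from per-token average log-probabilities rather than sequence-sum log-probabilities; this shrinks the log-ratios, which is presumably why the listed Beta (5) is much larger than the ~0.1 typical for standard DPO. Below is a minimal sketch of such a loss under that assumption; the function and argument names are illustrative and not taken from this repository.

```python
import torch.nn.functional as F


def length_normalized_dpo_loss(
    policy_chosen_logps,    # sum of response-token log-probs under the policy, shape (batch,)
    policy_rejected_logps,  # same for the rejected responses, shape (batch,)
    ref_chosen_logps,       # same quantities under the frozen reference model
    ref_rejected_logps,
    chosen_lengths,         # number of response tokens per chosen completion, shape (batch,)
    rejected_lengths,       # number of response tokens per rejected completion, shape (batch,)
    beta=5.0,               # the Beta value listed in the hyperparameters above
):
    # Length normalization: use the average per-token log-prob instead of the sequence sum.
    policy_logratio = (
        policy_chosen_logps / chosen_lengths
        - policy_rejected_logps / rejected_lengths
    )
    ref_logratio = (
        ref_chosen_logps / chosen_lengths
        - ref_rejected_logps / rejected_lengths
    )
    # Standard DPO objective applied to the length-normalized log-ratios.
    return -F.logsigmoid(beta * (policy_logratio - ref_logratio)).mean()
```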