Update README.md
README.md
CHANGED
@@ -14,16 +14,16 @@ datasets:
 
 # OLMo-2-1124-7B-DPO
 
-OLMo
+OLMo 2 7B DPO November 2024 is a post-trained variant of the [OLMo 2 7B November 2024](https://huggingface.co/allenai/OLMo2-7B-1124) model, which has undergone supervised finetuning on the [Tülu 3 dataset](https://huggingface.co/datasets/allenai/tulu-3-sft-olmo-2-mixture) and further DPO training on [this dataset](allenai/olmo-2-1124-7b-preference-mix).
 Tülu 3 is designed for state-of-the-art performance on a diversity of tasks in addition to chat, such as MATH, GSM8K, and IFEval.
-Check out
+Check out the OLMo 2 paper (forthcoming) or the [Tülu 3 paper](https://arxiv.org/abs/2411.15124) for more details!
 
 OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
 These models are trained on the Dolma dataset. We are releasing all code, checkpoints, logs (coming soon), and associated training details.
 The core models released in this batch include the following:
 
 
-| **Stage** | **OLMo
+| **Stage** | **OLMo 2 7B** | **OLMo 2 13B** |
 |----------------------|----------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------|
 | **Base Model** | [allenai/OLMo2-7B-1124](https://huggingface.co/allenai/OLMo2-7B-1124) | [allenai/OLMo-2-13B-1124](https://huggingface.co/allenai/OLMo-2-13B-1124) |
 | **SFT** | [allenai/OLMo-2-1124-7B-SFT](https://huggingface.co/allenai/OLMo-2-1124-7B-SFT) | [allenai/OLMo-2-1124-13B-SFT](https://huggingface.co/allenai/OLMo-2-1124-13B-SFT) |
@@ -86,7 +86,7 @@ The model has not been trained with a specific system prompt in mind.
 
 ### Bias, Risks, and Limitations
 
-The OLMo
+The OLMo 2 models have limited safety training and, unlike ChatGPT, are not deployed with automatic in-the-loop filtering of responses, so they can produce problematic outputs (especially when prompted to do so).
 See the Falcon 180B model card for an example of this.
 
 
@@ -109,13 +109,14 @@ DPO:
 
 ## License and use
 
-OLMo
-OLMo
+OLMo 2 is licensed under the Apache 2.0 license.
+OLMo 2 is intended for research and educational use.
 For more information, please see our [Responsible Use Guidelines](https://allenai.org/responsible-use).
+This model has been fine-tuned using a dataset mix with outputs generated from third-party models and is subject to additional terms: [Gemma Terms of Use](https://ai.google.dev/gemma/terms).
 
 ## Citation
 
-If OLMo
+If OLMo 2 or any of the related materials were helpful to your work, please cite:
 ```
 TODO
 ```