Update README.md
README.md CHANGED
@@ -17,7 +17,7 @@ library_name: transformers
 > FlexOlmo-7x7B-1T (without router training) is a Mixture-of-Experts with 33B total parameters, combining independently trained experts on public-mix, news, math, code, academic texts, creative writing, and Reddit data. The public-mix expert is trained on 1T tokens of public data while the other experts are branched from the public-mix expert and trained on 50B tokens of their respective data.

 This information and more can also be found:
-- **Paper**: https://allenai.org/papers/
+- **Paper**: https://allenai.org/papers/flexolmo
 - **Code**: https://github.com/allenai/FlexOlmo
 - **Blog**: https://allenai.org/blog/flexolmo
 - **Data and corresponding models**:

@@ -72,6 +72,6 @@ print(tokenizer.decode(out[0]))
   eprint={2507.00000},
   archivePrefix={arXiv},
   primaryClass={cs.CL},
-  url={https://allenai.org/papers/
+  url={https://allenai.org/papers/flexolmo},
 }
 ```
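The second hunk's context line shows the model card's generation snippet ending in `print(tokenizer.decode(out[0]))`, and the card metadata names `library_name: transformers`. For readers without the full README, here is a minimal sketch of that usage; the repo id `allenai/FlexOlmo-7x7B-1T`, the prompt, and the generation settings are assumptions rather than text taken from this diff:

```python
# Minimal usage sketch (assumed repo id and settings; not part of the diff above).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/FlexOlmo-7x7B-1T"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

inputs = tokenizer("FlexOlmo is a mixture-of-experts language model that", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0]))  # the call referenced in the hunk header above
```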