Transformers
English
Mixture of Experts
olmo
flexolmo
akshitab committed on
Commit 49305e0 (verified)
1 Parent(s): e69350f

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -17,7 +17,7 @@ library_name: transformers
 > FlexOlmo-7x7B-1T (without router training) is a Mixture-of-Experts with 33B total parameters, combining independently trained experts on public-mix, news, math, code, academic texts, creative writing, and Reddit data. The public-mix expert is trained on 1T tokens of public data while the other experts are branched from the public-mix expert and trained on 50B tokens of their respective data.
 
 This information and more can also be found:
-- **Paper**: https://allenai.org/papers/FlexOlmo
+- **Paper**: https://allenai.org/papers/flexolmo
 - **Code**: https://github.com/allenai/FlexOlmo
 - **Blog**: https://allenai.org/blog/flexolmo
 - **Data and corresponding models**:
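The card above describes a Transformers-compatible Mixture-of-Experts checkpoint. A minimal loading sketch follows, assuming a repository ID of `allenai/FlexOlmo-7x7B-1T` (inferred from the model name, not stated in this diff) and that the checkpoint loads through the standard Auto classes:

```python
# Minimal sketch: loading the model with Hugging Face Transformers.
# The repo ID below is an assumption inferred from the model name in the card;
# trust_remote_code may be required if the architecture is not in core Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/FlexOlmo-7x7B-1T"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # requires `accelerate`; shards the 33B total params across devices
)

prompt = "Mixture-of-Experts models route each token to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```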