Update README.md
README.md CHANGED
@@ -77,6 +77,8 @@ print(tokenizer.decode(output))
 |8x1.8b|24|2048|16|8|2|4096|407,498,752|8,858,863,616|2,924,279,808|9,266,362,368|
 |8x13b|40|5120|40|8|2|4096|1,018,746,880|72,144,081,920|22,200,806,400|73,162,828,800|
 
+If you would like to learn more about the pretraining of the LLM-jp-3 MoE series, please refer to this [blog post](https://llm-jp.nii.ac.jp/blog/2025/03/27/moe3.html).
+
 ## Tokenizer
 
 The tokenizer of this model is based on the [huggingface/tokenizers](https://github.com/huggingface/tokenizers) Unigram byte-fallback model.
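
The parameter columns in the table rows above are consistent with simple addition. The header row falls outside this hunk, so the column meanings are inferred, but assuming the last four numeric columns are embedding, non-embedding, activated, and total parameters, each row's total equals embedding plus non-embedding. A minimal sketch checking this:

```python
# Hedged sketch: the column names are inferred, since the table header is
# outside this diff hunk. Assuming (embedding, non-embedding, total) are the
# 8th, 9th, and 11th columns, total = embedding + non-embedding in each row.
rows = {
    "8x1.8b": (407_498_752, 8_858_863_616, 9_266_362_368),
    "8x13b": (1_018_746_880, 72_144_081_920, 73_162_828_800),
}
for name, (embedding, non_embedding, total) in rows.items():
    assert embedding + non_embedding == total, name
    print(f"{name}: {embedding:,} + {non_embedding:,} = {total:,}")
```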
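
The byte-fallback behavior mentioned in the Tokenizer section can be exercised directly. Below is a minimal sketch using `AutoTokenizer` from Hugging Face transformers; the checkpoint id `llm-jp/llm-jp-3-8x1.8b` is an assumption inferred from the model names in the table, not stated in this diff, so substitute the checkpoint you are actually using.

```python
# Minimal sketch, assuming the checkpoint id below matches the published
# model (inferred from the table above, not stated in this diff).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("llm-jp/llm-jp-3-8x1.8b")

# With a Unigram byte-fallback model, characters missing from the vocabulary
# are encoded as raw byte tokens instead of <unk>, so decoding the ids
# recovers the original text.
text = "自然言語処理"  # "natural language processing"
ids = tokenizer.encode(text)
print(ids)
print(tokenizer.decode(ids))
```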