Update README.md
README.md
CHANGED
@@ -14,6 +14,7 @@ inference:
 ---
 # Release Notes
 * this model is finetuned from mt5-small
+* uses about 1.5 GB of VRAM; with fp16 it stays under 1 GB (if the batch size is small); CPU inference speed is also acceptable
 * used a trimmed piece of the pontoon dataset that covers the ja to zh translation part
 * also scrambled a bunch of the translations from mt5-translation-ja_zh-game-v0.1, which amounts to a large amount of junk training data
 
@@ -23,7 +24,7 @@ inference:
 
 # Model Release Statement
 * this model was further trained from mt5-translation-ja_zh
-
+* uses more than 1.5 GB of VRAM; loading in fp16 takes less than 1 GB (a larger batch size pushes it above 1 GB); running on CPU is also reasonably fast
 * why this model was made<br>
 an attempt at fine-tuning an existing model; small models train remarkably fast<br>
 * limitations of this model<br>
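The VRAM figures in the added line can be sanity-checked with a back-of-the-envelope estimate of the weight size alone. The sketch below is illustrative and not taken from the commit: it assumes mt5-small has roughly 300 million parameters (an approximate figure), and the names `ASSUMED_PARAM_COUNT` and `weight_memory_gib` are hypothetical. It ignores activations, batch size, and framework overhead, which is why measured usage sits above these numbers.

```python
# Rough estimate of the memory needed to hold model weights only.
# Activations, batch size and framework overhead are NOT included.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

# Assumption: mt5-small has roughly 300 million parameters (approximate).
ASSUMED_PARAM_COUNT = 300_000_000


def weight_memory_gib(param_count: int, dtype: str) -> float:
    """Return the size of the weights alone, in GiB, for the given dtype."""
    return param_count * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    for dtype in ("fp32", "fp16"):
        size = weight_memory_gib(ASSUMED_PARAM_COUNT, dtype)
        print(f"{dtype}: {size:.2f} GiB for weights only")
```

Under that parameter-count assumption, the weights come to about 1.12 GiB in fp32 and about 0.56 GiB in fp16, which is in line with the "about 1.5G" and "less than 1G" figures once runtime overhead is added on top.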