Update README.md
README.md CHANGED

@@ -40,7 +40,7 @@ You can also use this model to get the features of a given text.
 
 ## Vocabulary
 
-
+A character-level vocabulary of size 6K is used. To be precise, rare characters may be split into bytes because byte-level byte-pair encoding (BPE) is used. The BPE tokenizer was trained on a small subset of the training data. Since the data were converted into a one-character-per-line format, merge operations never transgressed character boundaries.
 
 ## Training data
 
@@ -55,7 +55,7 @@ Also note that Japanese Wikipedia was duplicated 10 times to make the total size
 
 ## Training procedure
 
-The training took
+The training took about 3 months (with two interruptions) with a single NVIDIA A100 80GB GPU.
 
 The following hyperparameters were used during pre-training:
 
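For readers curious about the vocabulary construction described in the added paragraph, here is a minimal sketch of the idea: rewrite a corpus subset one character per line and train a byte-level BPE tokenizer on it, so merges can never cross character boundaries. The use of the Hugging Face `tokenizers` library, the file names, and the exact options are assumptions for illustration; the README does not state which tooling was actually used.

```python
# Hypothetical sketch: build a ~6K byte-level BPE vocabulary on data rewritten
# one character per line, so merges only happen within a single character.
# Library choice and file names are assumptions, not the README's actual setup.
from tokenizers import ByteLevelBPETokenizer

# Rewrite a small subset of the training data into one-character-per-line format.
with open("subset.txt", encoding="utf-8") as src, \
        open("one_char_per_line.txt", "w", encoding="utf-8") as dst:
    for line in src:
        for ch in line.rstrip("\n"):
            dst.write(ch + "\n")

# Train byte-level BPE: because each training "sentence" is a single character,
# merge operations stay inside that character's bytes, and rare characters that
# never earn a merge simply remain split into raw byte tokens.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(files=["one_char_per_line.txt"], vocab_size=6000)
tokenizer.save_model(".")
```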