Update README.md
README.md CHANGED
```diff
@@ -76,13 +76,20 @@ The following hyperparameters were used during pre-training:
 - training_steps: 300,000
 - warmup_steps: 10,000
 
-The accuracy of the trained model on the masked language modeling task was 0.
+The accuracy of the trained model on the masked language modeling task was 0.799.
 The evaluation set consists of 5,000 randomly sampled documents from each of the training corpora.
 
 ## Fine-tuning on NLU tasks
 
-
-
+We fine-tuned the following models and evaluated them on the dev set of JGLUE.
+We tuned learning rate and training epochs for each model and task following [the JGLUE paper](https://www.jstage.jst.go.jp/article/jnlp/30/1/30_63/_pdf/-char/ja).
+
+| Model                                    | MARC-ja/acc | JSTS/spearman | JNLI/acc | JSQuAD/EM | JSQuAD/F1 | JComQA/acc |
+|------------------------------------------|-------------|---------------|----------|-----------|-----------|------------|
+| nlp-waseda/roberta-base-japanese         | 0.965       | 0.876         | 0.905    | 0.853     | 0.916     | 0.853      |
+| nlp-waseda/roberta-large-japanese-seq512 | 0.969       | 0.890         | 0.928    | 0.910     | 0.955     | 0.900      |
+| ku-nlp/deberta-v2-base-japanese          | 0.970       | 0.886         | 0.922    | 0.899     | 0.951     | 0.873      |
+| ku-nlp/deberta-v2-large-japanese         | 0.968       | 0.892         | 0.919    | 0.912     | 0.959     | 0.890      |
 
 ## Acknowledgments
 
```
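The masked-LM accuracy added above is reported without its evaluation script. As a rough guide, here is a minimal sketch of how such a number can be computed with `transformers`, assuming one of the checkpoints named in the table, a 15% random masking rate, and caller-supplied evaluation texts; none of these details come from the commit itself.

```python
# Minimal sketch: masked-LM accuracy of a checkpoint on a list of texts.
# Assumptions (not from the commit): checkpoint name, 15% masking rate,
# single-sequence evaluation with greedy argmax decoding.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "ku-nlp/deberta-v2-large-japanese"  # one of the models in the table
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

def mlm_accuracy(texts, mask_prob=0.15):
    correct, total = 0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        input_ids = enc["input_ids"].clone()
        labels = input_ids[0].clone()
        # Never mask special tokens such as [CLS]/[SEP].
        special = torch.tensor(
            tokenizer.get_special_tokens_mask(labels.tolist(), already_has_special_tokens=True),
            dtype=torch.bool,
        )
        mask = (torch.rand(labels.shape[0]) < mask_prob) & ~special
        if not mask.any():
            continue
        input_ids[0, mask] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=input_ids, attention_mask=enc["attention_mask"]).logits
        preds = logits[0].argmax(dim=-1)
        correct += int((preds[mask] == labels[mask]).sum())
        total += int(mask.sum())
    return correct / total

print(mlm_accuracy(["吾輩は猫である。名前はまだ無い。"]))
```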
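The fine-tuning runs behind the table are likewise not included in the commit. The sketch below shows one way to reproduce a single JGLUE task (JNLI) with the `transformers` Trainer, sweeping a small learning-rate/epoch grid in the spirit of the JGLUE paper; the `shunk031/JGLUE` dataset path, its column names, and the grid values are illustrative assumptions.

```python
# Minimal sketch: fine-tune a listed checkpoint on JNLI and keep the best
# dev-set accuracy over a small hyperparameter grid. Dataset path, column
# names, and grid values are assumptions, not the card authors' actual setup.
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "ku-nlp/deberta-v2-base-japanese"
tokenizer = AutoTokenizer.from_pretrained(model_name)
dataset = load_dataset("shunk031/JGLUE", name="JNLI")  # community JGLUE loader

def preprocess(batch):
    # JNLI is 3-way sentence-pair classification (entailment/contradiction/neutral).
    return tokenizer(batch["sentence1"], batch["sentence2"],
                     truncation=True, max_length=128)

encoded = dataset.map(preprocess, batched=True)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

best = None
for lr in (2e-5, 3e-5, 5e-5):      # illustrative grid
    for epochs in (3, 4):
        model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)
        args = TrainingArguments(
            output_dir=f"out/jnli_lr{lr}_ep{epochs}",
            learning_rate=lr,
            num_train_epochs=epochs,
            per_device_train_batch_size=32,
            evaluation_strategy="epoch",
            save_strategy="no",
            report_to="none",
        )
        trainer = Trainer(model=model, args=args,
                          train_dataset=encoded["train"],
                          eval_dataset=encoded["validation"],
                          compute_metrics=accuracy,
                          tokenizer=tokenizer)
        trainer.train()
        acc = trainer.evaluate()["eval_accuracy"]
        if best is None or acc > best[0]:
            best = (acc, lr, epochs)

print("best (dev acc, lr, epochs):", best)
```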