Update README.md
README.md CHANGED
```diff
@@ -76,13 +76,20 @@ The following hyperparameters were used during pre-training:
 - training_steps: 300,000
 - warmup_steps: 10,000
 
-The accuracy of the trained model on the masked language modeling task was 0.
+The accuracy of the trained model on the masked language modeling task was 0.799.
 The evaluation set consists of 5,000 randomly sampled documents from each of the training corpora.
 
 ## Fine-tuning on NLU tasks
 
-
-
+We fine-tuned the following models and evaluated them on the dev set of JGLUE.
+We tuned learning rate and training epochs for each model and task following [the JGLUE paper](https://www.jstage.jst.go.jp/article/jnlp/30/1/30_63/_pdf/-char/ja).
+
+| Model                                    | MARC-ja/acc | JSTS/spearman | JNLI/acc | JSQuAD/EM | JSQuAD/F1 | JComQA/acc |
+|------------------------------------------|-------------|---------------|----------|-----------|-----------|------------|
+| nlp-waseda/roberta-base-japanese         | 0.965       | 0.876         | 0.905    | 0.853     | 0.916     | 0.853      |
+| nlp-waseda/roberta-large-japanese-seq512 | 0.969       | 0.890         | 0.928    | 0.910     | 0.955     | 0.900      |
+| ku-nlp/deberta-v2-base-japanese          | 0.970       | 0.886         | 0.922    | 0.899     | 0.951     | 0.873      |
+| ku-nlp/deberta-v2-large-japanese         | 0.968       | 0.892         | 0.919    | 0.912     | 0.959     | 0.890      |
 
 ## Acknowledgments
 
```
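The masked-LM accuracy added above is reported without its evaluation script. As a rough guide, here is a minimal sketch of how such a number can be computed with `transformers`, assuming one of the checkpoints named in the table, a 15% random masking rate, and caller-supplied evaluation texts; none of these details come from the commit itself.

```python
# Minimal sketch: masked-LM accuracy of a checkpoint on a list of texts.
# Assumptions (not from the commit): checkpoint name, 15% masking rate,
# single-sequence evaluation with greedy argmax decoding.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "ku-nlp/deberta-v2-large-japanese"  # one of the models in the table
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

def mlm_accuracy(texts, mask_prob=0.15):
    correct, total = 0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        input_ids = enc["input_ids"].clone()
        labels = input_ids[0].clone()
        # Never mask special tokens such as [CLS]/[SEP].
        special = torch.tensor(
            tokenizer.get_special_tokens_mask(labels.tolist(), already_has_special_tokens=True),
            dtype=torch.bool,
        )
        mask = (torch.rand(labels.shape[0]) < mask_prob) & ~special
        if not mask.any():
            continue
        input_ids[0, mask] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=input_ids, attention_mask=enc["attention_mask"]).logits
        preds = logits[0].argmax(dim=-1)
        correct += int((preds[mask] == labels[mask]).sum())
        total += int(mask.sum())
    return correct / total

print(mlm_accuracy(["吾輩は猫である。名前はまだ無い。"]))
```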
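The fine-tuning runs behind the table are likewise not included in the commit. The sketch below shows one way to reproduce a single JGLUE task (JNLI) with the `transformers` Trainer, sweeping a small learning-rate/epoch grid in the spirit of the JGLUE paper; the `shunk031/JGLUE` dataset path, its column names, and the grid values are illustrative assumptions.

```python
# Minimal sketch: fine-tune a listed checkpoint on JNLI and keep the best
# dev-set accuracy over a small hyperparameter grid. Dataset path, column
# names, and grid values are assumptions, not the card authors' actual setup.
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "ku-nlp/deberta-v2-base-japanese"
tokenizer = AutoTokenizer.from_pretrained(model_name)
dataset = load_dataset("shunk031/JGLUE", name="JNLI")  # community JGLUE loader

def preprocess(batch):
    # JNLI is 3-way sentence-pair classification (entailment/contradiction/neutral).
    return tokenizer(batch["sentence1"], batch["sentence2"],
                     truncation=True, max_length=128)

encoded = dataset.map(preprocess, batched=True)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

best = None
for lr in (2e-5, 3e-5, 5e-5):      # illustrative grid
    for epochs in (3, 4):
        model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)
        args = TrainingArguments(
            output_dir=f"out/jnli_lr{lr}_ep{epochs}",
            learning_rate=lr,
            num_train_epochs=epochs,
            per_device_train_batch_size=32,
            evaluation_strategy="epoch",
            save_strategy="no",
            report_to="none",
        )
        trainer = Trainer(model=model, args=args,
                          train_dataset=encoded["train"],
                          eval_dataset=encoded["validation"],
                          compute_metrics=accuracy,
                          tokenizer=tokenizer)
        trainer.train()
        acc = trainer.evaluate()["eval_accuracy"]
        if best is None or acc > best[0]:
            best = (acc, lr, epochs)

print("best (dev acc, lr, epochs):", best)
```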