metadata

license: cc-by-sa-4.0
tags:
  - generated_from_trainer
datasets:
  - te_dx_jp
model-index:
  - name: t5-base-TEDxJP-0front-1body-10rear-order-RB
    results: []

t5-base-TEDxJP-0front-1body-10rear-order-RB

This model is a fine-tuned version of sonoisa/t5-base-japanese on the te_dx_jp dataset. After the final (10th) training epoch, it achieves the following results on the evaluation set:

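Loss: 0.4705
Wer (word error rate): 0.1772
Mer (match error rate): 0.1711
Wil (word information lost): 0.2598
Wip (word information preserved): 0.7402
Hits: 55441
Substitutions: 6558
Deletions: 2588
Insertions: 2296
Cer (character error rate): 0.1388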
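
The word-level rates above can be recomputed from the four alignment counts. The following is a minimal sketch, assuming the standard definitions of these metrics (the same ones the jiwer package uses); the card itself does not state which tool produced the numbers, and the helper name `word_metrics` is hypothetical.

```python
def word_metrics(hits, substitutions, deletions, insertions):
    """Recompute word-level metrics from alignment counts (illustrative helper)."""
    errors = substitutions + deletions + insertions
    ref_len = hits + substitutions + deletions    # words in the reference
    hyp_len = hits + substitutions + insertions   # words in the hypothesis
    wer = errors / ref_len                        # word error rate
    mer = errors / (ref_len + insertions)         # match error rate
    wip = (hits / ref_len) * (hits / hyp_len)     # word information preserved
    wil = 1.0 - wip                               # word information lost
    return {"wer": wer, "mer": mer, "wil": wil, "wip": wip}


# Counts taken from the evaluation results listed above.
final = word_metrics(hits=55441, substitutions=6558, deletions=2588, insertions=2296)
print({name: round(value, 4) for name, value in final.items()})
```

Rounded to four decimal places, these values agree with the Wer, Mer, Wil, and Wip figures reported above. Cer is not covered here because it needs character-level counts, which the card does not list.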

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 32
eval_batch_size: 32
seed: 10
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10
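
To illustrate how learning_rate, lr_scheduler_type, and lr_scheduler_warmup_ratio interact, here is a rough sketch of the schedule in plain Python. It assumes the linear scheduler ramps up from zero during warmup and then decays linearly to zero, and it takes the total of 14570 optimizer steps from the last row of the training results. The function name `linear_lr` and the rounding of the warmup step count are assumptions made for illustration, not the trainer's actual code.

```python
def linear_lr(step, total_steps=14570, base_lr=1e-4, warmup_ratio=0.1):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    # Assumption: warmup length is the ratio times the total step count.
    warmup_steps = round(total_steps * warmup_ratio)  # 1457 here, i.e. one epoch
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr at the end of warmup down to 0 at the last step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Start of training, end of warmup (epoch 1), epoch 5, and the final step.
for step in (0, 1457, 7285, 14570):
    print(step, linear_lr(step))
```

Under these assumptions the warmup phase covers exactly the first epoch, so the peak learning rate of 0.0001 is reached at step 1457.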

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Mer	Wil	Wip	Hits	Substitutions	Deletions	Insertions	Cer
0.6067	1.0	1457	0.4967	0.2034	0.1934	0.2844	0.7156	54800	6821	2966	3351	0.1679
0.579	2.0	2914	0.4534	0.1882	0.1805	0.2697	0.7303	55162	6619	2806	2728	0.1546
0.4934	3.0	4371	0.4463	0.1768	0.1710	0.2592	0.7408	55362	6496	2729	2197	0.1396
0.4371	4.0	5828	0.4444	0.1766	0.1707	0.2580	0.7420	55381	6417	2789	2197	0.1387
0.3917	5.0	7285	0.4450	0.1771	0.1711	0.2595	0.7405	55415	6520	2652	2269	0.1389
0.3614	6.0	8742	0.4516	0.1775	0.1714	0.2592	0.7408	55443	6481	2663	2323	0.1379
0.375	7.0	10199	0.4568	0.1777	0.1715	0.2593	0.7407	55418	6475	2694	2306	0.1396
0.3615	8.0	11656	0.4622	0.1764	0.1706	0.2585	0.7415	55380	6472	2735	2188	0.1382
0.3129	9.0	13113	0.4678	0.1770	0.1709	0.2592	0.7408	55474	6524	2589	2318	0.1385
0.3082	10.0	14570	0.4705	0.1772	0.1711	0.2598	0.7402	55441	6558	2588	2296	0.1388

Framework versions

Transformers 4.21.2
PyTorch 1.12.1+cu116
Datasets 2.4.0
Tokenizers 0.12.1