# topic_classifier
This model is a fine-tuned version of distilbert/distilbert-base-multilingual-cased on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.2182
- Accuracy: 0.8869
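The card does not include a usage snippet; a minimal inference sketch follows. The repo id below is a hypothetical placeholder, and the label set depends on the unspecified fine-tuning data.

```python
# Minimal inference sketch. "your-username/topic_classifier" is a hypothetical
# repo id -- substitute the actual Hub path or a local checkpoint directory.
from transformers import pipeline

classifier = pipeline("text-classification", model="your-username/topic_classifier")

# The base model is multilingual, so non-English input should work too.
print(classifier("Le gouvernement a annoncé une nouvelle réforme fiscale."))
# e.g. [{'label': 'LABEL_3', 'score': 0.97}] -- labels depend on the training data
```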
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 40
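A hedged reproduction sketch of this setup with the standard `Trainer` API is shown below. The actual training data is not specified in the card, so a tiny placeholder dataset (and `num_labels=2`) stands in for it.

```python
# Minimal reproduction sketch of the hyperparameters above, assuming a
# sequence-classification fine-tune via the transformers Trainer API.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "distilbert/distilbert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Placeholder data -- replace with the real (unspecified) topic dataset.
raw = Dataset.from_dict({
    "text": ["example sentence one", "example sentence two"],
    "label": [0, 1],
})
ds = raw.map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

args = TrainingArguments(
    output_dir="topic_classifier",
    learning_rate=2e-05,               # from the card
    per_device_train_batch_size=16,    # from the card
    per_device_eval_batch_size=16,     # from the card
    seed=42,                           # from the card
    num_train_epochs=40,               # from the card
    lr_scheduler_type="linear",        # from the card
    optim="adamw_torch",               # from the card; betas/epsilon are the defaults
    eval_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=ds,
    eval_dataset=ds,                   # placeholder; the card uses a separate eval split
    processing_class=tokenizer,
)
trainer.train()
```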
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.7818 | 1.0 | 1859 | 0.6291 | 0.8493 |
| 0.5836 | 2.0 | 3718 | 0.5473 | 0.8644 |
| 0.4596 | 3.0 | 5577 | 0.5054 | 0.8795 |
| 0.3349 | 4.0 | 7436 | 0.5441 | 0.8721 |
| 0.2628 | 5.0 | 9295 | 0.5577 | 0.8783 |
| 0.2211 | 6.0 | 11154 | 0.5833 | 0.8810 |
| 0.1565 | 7.0 | 13013 | 0.6394 | 0.8761 |
| 0.1123 | 8.0 | 14872 | 0.6576 | 0.8847 |
| 0.0968 | 9.0 | 16731 | 0.7625 | 0.8798 |
| 0.0715 | 10.0 | 18590 | 0.8095 | 0.8835 |
| 0.0534 | 11.0 | 20449 | 0.9209 | 0.8807 |
| 0.0396 | 12.0 | 22308 | 0.9243 | 0.8823 |
| 0.0372 | 13.0 | 24167 | 0.9515 | 0.8835 |
| 0.0281 | 14.0 | 26026 | 1.0376 | 0.8798 |
| 0.0254 | 15.0 | 27885 | 1.0709 | 0.8854 |
| 0.0222 | 16.0 | 29744 | 1.0803 | 0.8881 |
| 0.0224 | 17.0 | 31603 | 1.1030 | 0.8820 |
| 0.0218 | 18.0 | 33462 | 1.1514 | 0.8795 |
| 0.0151 | 19.0 | 35321 | 1.1943 | 0.8807 |
| 0.0154 | 20.0 | 37180 | 1.2014 | 0.8826 |
| 0.012 | 21.0 | 39039 | 1.2208 | 0.8820 |
| 0.009 | 22.0 | 40898 | 1.2181 | 0.8804 |
| 0.0087 | 23.0 | 42757 | 1.1848 | 0.8838 |
| 0.0128 | 24.0 | 44616 | 1.1899 | 0.8829 |
| 0.0108 | 25.0 | 46475 | 1.2150 | 0.8860 |
| 0.009 | 26.0 | 48334 | 1.2330 | 0.8857 |
| 0.0118 | 27.0 | 50193 | 1.2174 | 0.8891 |
| 0.0104 | 28.0 | 52052 | 1.1944 | 0.8881 |
| 0.0049 | 29.0 | 53911 | 1.2085 | 0.8847 |
| 0.0063 | 30.0 | 55770 | 1.2342 | 0.8894 |
| 0.0075 | 31.0 | 57629 | 1.2276 | 0.8884 |
| 0.0035 | 32.0 | 59488 | 1.2319 | 0.8875 |
| 0.006 | 33.0 | 61347 | 1.2193 | 0.8860 |
| 0.0048 | 34.0 | 63206 | 1.2208 | 0.8863 |
| 0.0067 | 35.0 | 65065 | 1.2108 | 0.8857 |
| 0.0024 | 36.0 | 66924 | 1.2278 | 0.8884 |
| 0.003 | 37.0 | 68783 | 1.2291 | 0.8878 |
| 0.0024 | 38.0 | 70642 | 1.2284 | 0.8891 |
| 0.0033 | 39.0 | 72501 | 1.2153 | 0.8878 |
| 0.0046 | 40.0 | 74360 | 1.2182 | 0.8869 |
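The card does not show how the accuracy column was computed; a typical `compute_metrics` function for this kind of `Trainer` setup is sketched below (an assumption about the evaluation code, not the author's own).

```python
import numpy as np

def compute_metrics(eval_pred):
    # Standard sequence-classification accuracy: argmax over the logits,
    # then compare predictions against the reference labels.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}
```

Passed as `Trainer(..., compute_metrics=compute_metrics)`, this would produce the per-epoch accuracy values reported above.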
### Framework versions
- Transformers 4.53.2
- Pytorch 2.7.1+cu126
- Datasets 4.0.0
- Tokenizers 0.21.2