roberta-large-sst-2-16-13

This model is a fine-tuned version of roberta-large. The training data is not documented in this card, although the model name suggests a small (16-example) SST-2 subset. It achieves the following results on the evaluation set:

  • Loss: 0.3222
  • Accuracy: 0.8438
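
A minimal inference sketch, assuming the checkpoint is published under the repo id simonycl/roberta-large-sst-2-16-13 (named in the model tree below) and that the two labels follow the usual SST-2 convention of negative/positive; check the label mapping in the checkpoint's config before relying on it:

```python
# Sketch only: load the fine-tuned checkpoint and classify a sentence.
# Assumes the repo id "simonycl/roberta-large-sst-2-16-13" and that
# LABEL_0 / LABEL_1 correspond to negative / positive (SST-2 convention);
# verify against the checkpoint's config before trusting the mapping.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="simonycl/roberta-large-sst-2-16-13",
)

print(classifier("A thoroughly enjoyable film with a strong cast."))
# -> e.g. [{'label': 'LABEL_1', 'score': ...}]
```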

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 150
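
A hedged sketch of how the values above map onto the standard transformers TrainingArguments; the actual training script is not included in this card, so names such as output_dir below are illustrative only:

```python
# Sketch only: reconstructs the listed hyperparameters as TrainingArguments.
# Dataset loading, the Trainer itself, and metric computation are omitted
# because they are not documented in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-large-sst-2-16-13",  # illustrative output directory
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=150,
    evaluation_strategy="epoch",  # the results table reports metrics once per epoch
)
```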

Training results

Training Loss Epoch Step Validation Loss Accuracy
No log 1.0 1 0.7045 0.5
No log 2.0 2 0.7045 0.5
No log 3.0 3 0.7045 0.5
No log 4.0 4 0.7045 0.5
No log 5.0 5 0.7045 0.5
No log 6.0 6 0.7045 0.5
No log 7.0 7 0.7044 0.5
No log 8.0 8 0.7044 0.5
No log 9.0 9 0.7044 0.5
0.7125 10.0 10 0.7043 0.5
0.7125 11.0 11 0.7043 0.5
0.7125 12.0 12 0.7042 0.5
0.7125 13.0 13 0.7042 0.5
0.7125 14.0 14 0.7041 0.5
0.7125 15.0 15 0.7041 0.5
0.7125 16.0 16 0.7040 0.5
0.7125 17.0 17 0.7040 0.5
0.7125 18.0 18 0.7039 0.5
0.7125 19.0 19 0.7039 0.5
0.6935 20.0 20 0.7038 0.5
0.6935 21.0 21 0.7038 0.5
0.6935 22.0 22 0.7037 0.5
0.6935 23.0 23 0.7037 0.5
0.6935 24.0 24 0.7037 0.5
0.6935 25.0 25 0.7036 0.5
0.6935 26.0 26 0.7036 0.5
0.6935 27.0 27 0.7035 0.5
0.6935 28.0 28 0.7035 0.5
0.6935 29.0 29 0.7034 0.5
0.7031 30.0 30 0.7033 0.5
0.7031 31.0 31 0.7032 0.5
0.7031 32.0 32 0.7031 0.5
0.7031 33.0 33 0.7030 0.5
0.7031 34.0 34 0.7029 0.5
0.7031 35.0 35 0.7027 0.5
0.7031 36.0 36 0.7027 0.5
0.7031 37.0 37 0.7026 0.5
0.7031 38.0 38 0.7025 0.5
0.7031 39.0 39 0.7024 0.5
0.7021 40.0 40 0.7023 0.5
0.7021 41.0 41 0.7022 0.5
0.7021 42.0 42 0.7021 0.5
0.7021 43.0 43 0.7019 0.5
0.7021 44.0 44 0.7017 0.5
0.7021 45.0 45 0.7016 0.5
0.7021 46.0 46 0.7014 0.5
0.7021 47.0 47 0.7012 0.5
0.7021 48.0 48 0.7010 0.5
0.7021 49.0 49 0.7007 0.5
0.7009 50.0 50 0.7005 0.5
0.7009 51.0 51 0.7003 0.5
0.7009 52.0 52 0.7001 0.5
0.7009 53.0 53 0.6998 0.5
0.7009 54.0 54 0.6996 0.5
0.7009 55.0 55 0.6994 0.5
0.7009 56.0 56 0.6993 0.5
0.7009 57.0 57 0.6992 0.5
0.7009 58.0 58 0.6990 0.5
0.7009 59.0 59 0.6988 0.5
0.6866 60.0 60 0.6986 0.5
0.6866 61.0 61 0.6984 0.5
0.6866 62.0 62 0.6983 0.5
0.6866 63.0 63 0.6981 0.5
0.6866 64.0 64 0.6979 0.5
0.6866 65.0 65 0.6977 0.5
0.6866 66.0 66 0.6976 0.4688
0.6866 67.0 67 0.6974 0.4688
0.6866 68.0 68 0.6972 0.4688
0.6866 69.0 69 0.6970 0.4688
0.6818 70.0 70 0.6968 0.4688
0.6818 71.0 71 0.6966 0.4688
0.6818 72.0 72 0.6964 0.4688
0.6818 73.0 73 0.6961 0.4688
0.6818 74.0 74 0.6960 0.4688
0.6818 75.0 75 0.6959 0.4688
0.6818 76.0 76 0.6957 0.4688
0.6818 77.0 77 0.6955 0.4688
0.6818 78.0 78 0.6953 0.4688
0.6818 79.0 79 0.6948 0.4688
0.6639 80.0 80 0.6940 0.4688
0.6639 81.0 81 0.6932 0.4688
0.6639 82.0 82 0.6925 0.4688
0.6639 83.0 83 0.6916 0.4688
0.6639 84.0 84 0.6908 0.5
0.6639 85.0 85 0.6899 0.5
0.6639 86.0 86 0.6889 0.5
0.6639 87.0 87 0.6878 0.5
0.6639 88.0 88 0.6869 0.5
0.6639 89.0 89 0.6859 0.4688
0.6652 90.0 90 0.6850 0.4688
0.6652 91.0 91 0.6842 0.4688
0.6652 92.0 92 0.6836 0.5312
0.6652 93.0 93 0.6829 0.5312
0.6652 94.0 94 0.6818 0.5625
0.6652 95.0 95 0.6806 0.5938
0.6652 96.0 96 0.6792 0.5938
0.6652 97.0 97 0.6783 0.5938
0.6652 98.0 98 0.6771 0.5938
0.6652 99.0 99 0.6758 0.5938
0.621 100.0 100 0.6743 0.5938
0.621 101.0 101 0.6725 0.5938
0.621 102.0 102 0.6711 0.5938
0.621 103.0 103 0.6708 0.5938
0.621 104.0 104 0.6713 0.625
0.621 105.0 105 0.6693 0.5938
0.621 106.0 106 0.6605 0.5938
0.621 107.0 107 0.6499 0.5938
0.621 108.0 108 0.6439 0.5625
0.621 109.0 109 0.6434 0.625
0.5331 110.0 110 0.6439 0.5938
0.5331 111.0 111 0.6418 0.5625
0.5331 112.0 112 0.6388 0.5625
0.5331 113.0 113 0.6346 0.5625
0.5331 114.0 114 0.6307 0.5625
0.5331 115.0 115 0.6275 0.5625
0.5331 116.0 116 0.6230 0.5625
0.5331 117.0 117 0.6144 0.5625
0.5331 118.0 118 0.6092 0.5625
0.5331 119.0 119 0.6042 0.5938
0.4594 120.0 120 0.6006 0.5938
0.4594 121.0 121 0.5971 0.5938
0.4594 122.0 122 0.5906 0.5938
0.4594 123.0 123 0.5841 0.5938
0.4594 124.0 124 0.5759 0.6562
0.4594 125.0 125 0.5682 0.6875
0.4594 126.0 126 0.5566 0.6875
0.4594 127.0 127 0.5431 0.6875
0.4594 128.0 128 0.5314 0.6875
0.4594 129.0 129 0.5221 0.7188
0.33 130.0 130 0.5145 0.7188
0.33 131.0 131 0.5062 0.7188
0.33 132.0 132 0.4988 0.7188
0.33 133.0 133 0.4888 0.7188
0.33 134.0 134 0.4689 0.7188
0.33 135.0 135 0.4586 0.75
0.33 136.0 136 0.4464 0.7812
0.33 137.0 137 0.4330 0.7812
0.33 138.0 138 0.4185 0.7812
0.33 139.0 139 0.4004 0.8125
0.2099 140.0 140 0.3852 0.8125
0.2099 141.0 141 0.3724 0.8125
0.2099 142.0 142 0.3610 0.8125
0.2099 143.0 143 0.3613 0.8125
0.2099 144.0 144 0.3731 0.7812
0.2099 145.0 145 0.3655 0.8125
0.2099 146.0 146 0.3553 0.8125
0.2099 147.0 147 0.3457 0.8125
0.2099 148.0 148 0.3380 0.8438
0.2099 149.0 149 0.3315 0.8438
0.0894 150.0 150 0.3222 0.8438

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3

Model tree for simonycl/roberta-large-sst-2-16-13

  • Fine-tuned from roberta-large