bert-large-uncased-sst-2-64-13

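This model is a fine-tuned version of bert-large-uncased. The card lists the training dataset as unknown; the model name suggests SST-2, but nothing here confirms it. After the final epoch (150, step 600), it achieves the following results on the evaluation set: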

Loss: 0.7922
Accuracy: 0.9062
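
The card does not state the size of the evaluation set, but the reported accuracies constrain it. The sketch below searches for the smallest set size at which every distinct accuracy in the training results is a whole number of correct predictions, rounded to four decimals. This is an inference, not a documented fact: the function name `smallest_consistent_size` is hypothetical, four-decimal rounding is assumed, and any multiple of the returned size fits equally well.

```python
# Distinct accuracy values copied from the training results table.
REPORTED_ACCURACIES = [
    0.4922, 0.5391, 0.6562, 0.7109, 0.6953, 0.8359, 0.8438,
    0.8281, 0.8828, 0.9062, 0.9219, 0.9141, 0.8906, 0.8984,
]


def smallest_consistent_size(accuracies, max_size=1000, decimals=4):
    """Return the smallest n such that each accuracy is k/n rounded to `decimals` places."""
    # Half of the last reported digit, plus a tiny slack for float error
    # and for either rounding convention on exact halves.
    tolerance = 0.5 * 10 ** -decimals + 1e-9
    for n in range(1, max_size + 1):
        if all(abs(round(a * n) / n - a) <= tolerance for a in accuracies):
            return n
    return None  # no candidate size up to max_size fits


size = smallest_consistent_size(REPORTED_ACCURACIES)
print(size)  # → 128
```

Under these assumptions, the final accuracy of 0.9062 corresponds to 116 correct predictions out of 128.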

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 50
num_epochs: 150
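
To make the learning-rate settings concrete, here is a minimal sketch of a linear schedule with warmup using the values listed above. It assumes the usual shape of such a schedule (linear ramp from 0 to the peak over the warmup steps, then linear decay to 0 at the last step) and takes the total of 600 steps and 4 steps per epoch from the training results. The function name `linear_schedule_lr` is hypothetical and is not part of any library.

```python
PEAK_LR = 1e-5        # learning_rate
WARMUP_STEPS = 50     # lr_scheduler_warmup_steps
TOTAL_STEPS = 600     # 150 epochs x 4 steps per epoch (from the results table)
STEPS_PER_EPOCH = 4


def linear_schedule_lr(step, peak_lr=PEAK_LR,
                       warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup steps.
        return peak_lr * step / warmup_steps
    # Decay from peak_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Learning rate at the end of a few selected epochs.
for epoch in (1, 12, 13, 75, 150):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:3d}  step {step:3d}  lr {linear_schedule_lr(step):.3e}")
```

With 4 steps per epoch, warmup ends at step 50, during epoch 13, so the first twelve epochs train below the peak learning rate.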

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	4	0.7522	0.4922
No log	2.0	8	0.7425	0.4922
0.76	3.0	12	0.7340	0.4922
0.76	4.0	16	0.7249	0.4922
0.7156	5.0	20	0.7169	0.4922
0.7156	6.0	24	0.7071	0.4922
0.7156	7.0	28	0.6967	0.4922
0.696	8.0	32	0.6778	0.4922
0.696	9.0	36	0.6520	0.5391
0.6324	10.0	40	0.6192	0.6562
0.6324	11.0	44	0.5962	0.7109
0.6324	12.0	48	0.5862	0.6953
0.5297	13.0	52	0.5024	0.8359
0.5297	14.0	56	0.4287	0.8438
0.3191	15.0	60	0.3940	0.8281
0.3191	16.0	64	0.3352	0.8828
0.3191	17.0	68	0.3640	0.8359
0.1373	18.0	72	0.2822	0.9062
0.1373	19.0	76	0.2677	0.9062
0.0624	20.0	80	0.2650	0.9219
0.0624	21.0	84	0.2758	0.9141
0.0624	22.0	88	0.2662	0.9141
0.0257	23.0	92	0.3016	0.9141
0.0257	24.0	96	0.3611	0.8906
0.0118	25.0	100	0.3683	0.8984
0.0118	26.0	104	0.3733	0.8984
0.0118	27.0	108	0.3953	0.8984
0.0065	28.0	112	0.4194	0.8984
0.0065	29.0	116	0.4195	0.8984
0.0042	30.0	120	0.4249	0.8984
0.0042	31.0	124	0.4360	0.9062
0.0042	32.0	128	0.4412	0.9062
0.0033	33.0	132	0.4467	0.9062
0.0033	34.0	136	0.4550	0.9062
0.0026	35.0	140	0.4652	0.9062
0.0026	36.0	144	0.4725	0.9062
0.0026	37.0	148	0.4796	0.9062
0.0021	38.0	152	0.4906	0.9062
0.0021	39.0	156	0.5007	0.9062
0.0019	40.0	160	0.5109	0.9062
0.0019	41.0	164	0.5194	0.9062
0.0019	42.0	168	0.5274	0.9062
0.0014	43.0	172	0.5348	0.9062
0.0014	44.0	176	0.5408	0.9062
0.0012	45.0	180	0.5484	0.9062
0.0012	46.0	184	0.5577	0.9062
0.0012	47.0	188	0.5688	0.9062
0.0009	48.0	192	0.5802	0.8984
0.0009	49.0	196	0.5905	0.8984
0.0007	50.0	200	0.6000	0.8984
0.0007	51.0	204	0.6085	0.8984
0.0007	52.0	208	0.6164	0.8984
0.0006	53.0	212	0.6250	0.8984
0.0006	54.0	216	0.6326	0.8984
0.0005	55.0	220	0.6389	0.8984
0.0005	56.0	224	0.6453	0.8984
0.0005	57.0	228	0.6451	0.8984
0.0005	58.0	232	0.6473	0.9062
0.0005	59.0	236	0.6512	0.9062
0.0003	60.0	240	0.6561	0.9062
0.0003	61.0	244	0.6620	0.9062
0.0003	62.0	248	0.6680	0.9062
0.0003	63.0	252	0.6736	0.9062
0.0003	64.0	256	0.6788	0.9062
0.0003	65.0	260	0.6836	0.9062
0.0003	66.0	264	0.6880	0.9062
0.0003	67.0	268	0.6923	0.9062
0.0002	68.0	272	0.6954	0.9062
0.0002	69.0	276	0.6983	0.9062
0.0002	70.0	280	0.7008	0.9062
0.0002	71.0	284	0.7032	0.9062
0.0002	72.0	288	0.7059	0.9062
0.0002	73.0	292	0.7085	0.9062
0.0002	74.0	296	0.7112	0.9062
0.0002	75.0	300	0.7144	0.9062
0.0002	76.0	304	0.7173	0.9062
0.0002	77.0	308	0.7199	0.9062
0.0002	78.0	312	0.7223	0.9062
0.0002	79.0	316	0.7247	0.9062
0.0002	80.0	320	0.7272	0.9062
0.0002	81.0	324	0.7295	0.9062
0.0002	82.0	328	0.7318	0.9062
0.0001	83.0	332	0.7341	0.9062
0.0001	84.0	336	0.7362	0.9062
0.0001	85.0	340	0.7383	0.9062
0.0001	86.0	344	0.7402	0.9062
0.0001	87.0	348	0.7417	0.9062
0.0001	88.0	352	0.7430	0.9062
0.0001	89.0	356	0.7445	0.9062
0.0001	90.0	360	0.7458	0.9062
0.0001	91.0	364	0.7470	0.9062
0.0001	92.0	368	0.7463	0.9062
0.0001	93.0	372	0.7463	0.9062
0.0001	94.0	376	0.7466	0.9062
0.0001	95.0	380	0.7472	0.9062
0.0001	96.0	384	0.7469	0.9062
0.0001	97.0	388	0.7472	0.9062
0.0001	98.0	392	0.7480	0.9062
0.0001	99.0	396	0.7488	0.9062
0.0001	100.0	400	0.7501	0.9062
0.0001	101.0	404	0.7514	0.9062
0.0001	102.0	408	0.7527	0.9062
0.0001	103.0	412	0.7539	0.9062
0.0001	104.0	416	0.7551	0.9062
0.0001	105.0	420	0.7563	0.9062
0.0001	106.0	424	0.7575	0.9062
0.0001	107.0	428	0.7584	0.9062
0.0001	108.0	432	0.7593	0.9062
0.0001	109.0	436	0.7603	0.9062
0.0001	110.0	440	0.7612	0.9062
0.0001	111.0	444	0.7622	0.9062
0.0001	112.0	448	0.7631	0.9062
0.0001	113.0	452	0.7640	0.9062
0.0001	114.0	456	0.7650	0.9062
0.0001	115.0	460	0.7659	0.9062
0.0001	116.0	464	0.7669	0.9062
0.0001	117.0	468	0.7677	0.9062
0.0001	118.0	472	0.7686	0.9062
0.0001	119.0	476	0.7693	0.9062
0.0001	120.0	480	0.7701	0.9062
0.0001	121.0	484	0.7708	0.9062
0.0001	122.0	488	0.7756	0.9062
0.0015	123.0	492	0.7777	0.9062
0.0015	124.0	496	0.7776	0.9062
0.0001	125.0	500	0.7776	0.9062
0.0001	126.0	504	0.7780	0.9062
0.0001	127.0	508	0.7786	0.9062
0.0001	128.0	512	0.7794	0.9062
0.0001	129.0	516	0.7803	0.9062
0.0002	130.0	520	0.7822	0.9062
0.0002	131.0	524	0.7843	0.9062
0.0002	132.0	528	0.7859	0.9062
0.0001	133.0	532	0.7871	0.9062
0.0001	134.0	536	0.7880	0.9062
0.0001	135.0	540	0.7887	0.9062
0.0001	136.0	544	0.7894	0.9062
0.0001	137.0	548	0.7899	0.9062
0.0001	138.0	552	0.7903	0.9062
0.0001	139.0	556	0.7907	0.9062
0.0001	140.0	560	0.7910	0.9062
0.0001	141.0	564	0.7912	0.9062
0.0001	142.0	568	0.7914	0.9062
0.0001	143.0	572	0.7916	0.9062
0.0001	144.0	576	0.7918	0.9062
0.0001	145.0	580	0.7919	0.9062
0.0001	146.0	584	0.7920	0.9062
0.0001	147.0	588	0.7921	0.9062
0.0001	148.0	592	0.7922	0.9062
0.0001	149.0	596	0.7922	0.9062
0.0001	150.0	600	0.7922	0.9062
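
In the table above, validation loss bottoms out at epoch 20 and then climbs for the remaining 130 epochs while training loss approaches zero, so the final checkpoint is not the one with the lowest validation loss. As an illustration, the sketch below parses a handful of rows copied from the table and picks the row with the lowest validation loss. The helper name `best_by_validation_loss` is hypothetical, and the card does not say which checkpoint was actually published.

```python
import csv
import io

# A few rows copied from the table above (tab-separated, same columns).
ROWS = """Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy
No log\t1.0\t4\t0.7522\t0.4922
0.1373\t18.0\t72\t0.2822\t0.9062
0.0624\t20.0\t80\t0.2650\t0.9219
0.0257\t24.0\t96\t0.3611\t0.8906
0.0007\t50.0\t200\t0.6000\t0.8984
0.0001\t150.0\t600\t0.7922\t0.9062
"""


def best_by_validation_loss(table_text):
    """Parse a tab-separated results table and return the row with the lowest validation loss."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    rows = [
        {
            "epoch": float(r["Epoch"]),
            "step": int(r["Step"]),
            "val_loss": float(r["Validation Loss"]),
            "accuracy": float(r["Accuracy"]),
        }
        for r in reader
    ]
    return min(rows, key=lambda r: r["val_loss"])


best = best_by_validation_loss(ROWS)
print(best)
```

On the full table the same selection also lands on epoch 20 (step 80), with validation loss 0.2650 and accuracy 0.9219.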

Framework versions

Transformers 4.32.0.dev0
PyTorch 2.0.1+cu118
Datasets 2.4.0
Tokenizers 0.13.3

Repository: simonycl/bert-large-uncased-sst-2-64-13
