deit-ena24

This model is a fine-tuned version of facebook/deit-base-distilled-patch16-224 on the ena24 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1233
  • Accuracy: 0.9809
  • F1: 0.9799

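For reference, the checkpoint can be exercised with the standard transformers image-classification pipeline. A minimal sketch, assuming the model is published under the hub id mbiarreta/deit-ena24; the image path is a hypothetical placeholder:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint (hub id assumed from this repo).
classifier = pipeline("image-classification", model="mbiarreta/deit-ena24")

# "path/to/image.jpg" is a placeholder; any PIL-loadable image works.
predictions = classifier("path/to/image.jpg")
print(predictions)  # top predicted labels with scores
```
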
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 7
  • mixed_precision_training: Native AMP
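
A minimal sketch of how these settings map onto transformers TrainingArguments, assuming the usual Trainer setup; output_dir, fp16, and the evaluation cadence are assumptions not stated in the list above:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; model and dataset setup omitted.
training_args = TrainingArguments(
    output_dir="deit-ena24",        # hypothetical output directory
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",            # betas=(0.9, 0.999) and epsilon=1e-08 are the adamw_torch defaults
    lr_scheduler_type="linear",
    num_train_epochs=7,
    fp16=True,                      # "Native AMP" mixed-precision training (assumed flag)
    eval_strategy="steps",          # assumed from the 100-step eval cadence in the results table
    eval_steps=100,
)
```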

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | F1     |
|:-------------:|:------:|:----:|:---------------:|:--------:|:------:|
| 1.396         | 0.1302 | 100  | 1.0114          | 0.7107   | 0.6602 |
| 1.0428        | 0.2604 | 200  | 0.7400          | 0.7939   | 0.7694 |
| 0.6952        | 0.3906 | 300  | 0.6129          | 0.8160   | 0.7981 |
| 0.4429        | 0.5208 | 400  | 0.4991          | 0.8618   | 0.8171 |
| 0.5441        | 0.6510 | 500  | 0.4392          | 0.8840   | 0.8631 |
| 0.4533        | 0.7812 | 600  | 0.4120          | 0.8985   | 0.8765 |
| 0.1213        | 0.9115 | 700  | 0.3953          | 0.8916   | 0.8738 |
| 0.1151        | 1.0417 | 800  | 0.3146          | 0.9237   | 0.9141 |
| 0.0953        | 1.1719 | 900  | 0.4656          | 0.9015   | 0.8786 |
| 0.1876        | 1.3021 | 1000 | 0.3164          | 0.9168   | 0.9023 |
| 0.2368        | 1.4323 | 1100 | 0.2997          | 0.9305   | 0.9219 |
| 0.0658        | 1.5625 | 1200 | 0.2324          | 0.9534   | 0.9473 |
| 0.0566        | 1.6927 | 1300 | 0.3444          | 0.9176   | 0.9077 |
| 0.2437        | 1.8229 | 1400 | 0.3033          | 0.9435   | 0.9363 |
| 0.1011        | 1.9531 | 1500 | 0.2740          | 0.9450   | 0.9330 |
| 0.2987        | 2.0833 | 1600 | 0.2715          | 0.9489   | 0.9419 |
| 0.0227        | 2.2135 | 1700 | 0.2050          | 0.9603   | 0.9562 |
| 0.1891        | 2.3438 | 1800 | 0.2055          | 0.9542   | 0.9494 |
| 0.0325        | 2.4740 | 1900 | 0.2070          | 0.9626   | 0.9604 |
| 0.0407        | 2.6042 | 2000 | 0.1876          | 0.9611   | 0.9550 |
| 0.0112        | 2.7344 | 2100 | 0.1702          | 0.9748   | 0.9719 |
| 0.112         | 2.8646 | 2200 | 0.1695          | 0.9656   | 0.9624 |
| 0.184         | 2.9948 | 2300 | 0.2088          | 0.9626   | 0.9590 |
| 0.0464        | 3.125  | 2400 | 0.1805          | 0.9656   | 0.9613 |
| 0.0794        | 3.2552 | 2500 | 0.2089          | 0.9634   | 0.9608 |
| 0.0033        | 3.3854 | 2600 | 0.2128          | 0.9603   | 0.9623 |
| 0.0422        | 3.5156 | 2700 | 0.1378          | 0.9702   | 0.9701 |
| 0.2038        | 3.6458 | 2800 | 0.1674          | 0.9687   | 0.9685 |
| 0.0156        | 3.7760 | 2900 | 0.1383          | 0.9756   | 0.9758 |
| 0.0004        | 3.9062 | 3000 | 0.1544          | 0.9733   | 0.9715 |
| 0.002         | 4.0365 | 3100 | 0.1552          | 0.9710   | 0.9690 |
| 0.0405        | 4.1667 | 3200 | 0.1326          | 0.9763   | 0.9751 |
| 0.0031        | 4.2969 | 3300 | 0.1437          | 0.9756   | 0.9759 |
| 0.0022        | 4.4271 | 3400 | 0.1316          | 0.9794   | 0.9790 |
| 0.0019        | 4.5573 | 3500 | 0.1233          | 0.9809   | 0.9799 |
| 0.0005        | 4.6875 | 3600 | 0.1400          | 0.9771   | 0.9763 |
| 0.0002        | 4.8177 | 3700 | 0.1339          | 0.9794   | 0.9797 |
| 0.0304        | 4.9479 | 3800 | 0.1469          | 0.9794   | 0.9795 |
| 0.0           | 5.0781 | 3900 | 0.1532          | 0.9763   | 0.9760 |
| 0.0           | 5.2083 | 4000 | 0.1530          | 0.9779   | 0.9771 |
| 0.0006        | 5.3385 | 4100 | 0.1434          | 0.9771   | 0.9765 |
| 0.0218        | 5.4688 | 4200 | 0.1468          | 0.9763   | 0.9751 |
| 0.0043        | 5.5990 | 4300 | 0.1568          | 0.9779   | 0.9763 |
| 0.0246        | 5.7292 | 4400 | 0.1582          | 0.9771   | 0.9748 |
| 0.0052        | 5.8594 | 4500 | 0.1489          | 0.9786   | 0.9774 |
| 0.0003        | 5.9896 | 4600 | 0.1499          | 0.9779   | 0.9775 |
| 0.0           | 6.1198 | 4700 | 0.1457          | 0.9794   | 0.9786 |
| 0.0           | 6.25   | 4800 | 0.1437          | 0.9802   | 0.9794 |
| 0.0048        | 6.3802 | 4900 | 0.1440          | 0.9794   | 0.9782 |
| 0.0002        | 6.5104 | 5000 | 0.1417          | 0.9802   | 0.9793 |
| 0.0001        | 6.6406 | 5100 | 0.1427          | 0.9802   | 0.9794 |
| 0.0           | 6.7708 | 5200 | 0.1422          | 0.9802   | 0.9787 |
| 0.0002        | 6.9010 | 5300 | 0.1425          | 0.9802   | 0.9787 |

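The Accuracy and F1 columns above are standard classification metrics. A minimal compute_metrics sketch using the evaluate library is shown below; the card does not state the F1 averaging mode, so average="weighted" is an assumption:

```python
import numpy as np
import evaluate

# Metrics as typically wired into a transformers Trainer run.
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy.compute(predictions=preds, references=labels)["accuracy"],
        # average="weighted" is an assumption, not stated in the card.
        "f1": f1.compute(predictions=preds, references=labels, average="weighted")["f1"],
    }
```
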
Framework versions

  • Transformers 4.52.3
  • PyTorch 2.7.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1