distil_low_lr

This model is a fine-tuned version of distilbert/distilbert-base-uncased on the generator dataset. After the final training epoch (epoch 20, step 500), it achieves the following results on the evaluation set:

Loss: 0.8611
Accuracy: 0.745

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 64
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
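
As a rough illustration of what `lr_scheduler_type: linear` implies for this run, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (no warmup setting is listed above) and 500 total steps (20 epochs at 25 steps per epoch, as reported under Training results). The function name `linear_lr` is hypothetical and not part of any library.

```python
def linear_lr(step, base_lr=1e-05, total_steps=500, warmup_steps=0):
    """Learning rate after `step` optimizer updates with linear warmup and decay.

    Assumptions (not stated in the model card): warmup_steps=0 and
    total_steps=500, i.e. 20 epochs x 25 steps per epoch.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    steps_per_epoch = 25  # assumed from the step counts in the results table
    for epoch in (0, 1, 10, 20):
        step = epoch * steps_per_epoch
        print(f"epoch {epoch:>2} (step {step:>3}): lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 1e-05, is halved by the midpoint of training, and reaches zero at the last step, so the later epochs take much smaller updates than the early ones.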

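Training results

The Training Loss column reads "No log" for every row in which no training loss had been logged yet; the first logged value appears at step 500.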

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	25	0.6622	0.5975
No log	2.0	50	0.6039	0.67
No log	3.0	75	0.5630	0.71
No log	4.0	100	0.5624	0.7075
No log	5.0	125	0.5749	0.7325
No log	6.0	150	0.5818	0.73
No log	7.0	175	0.6017	0.735
No log	8.0	200	0.6407	0.735
No log	9.0	225	0.6702	0.74
No log	10.0	250	0.6999	0.74
No log	11.0	275	0.7251	0.7325
No log	12.0	300	0.7472	0.735
No log	13.0	325	0.7647	0.745
No log	14.0	350	0.7962	0.74
No log	15.0	375	0.8120	0.7325
No log	16.0	400	0.8283	0.745
No log	17.0	425	0.8366	0.7425
No log	18.0	450	0.8481	0.745
No log	19.0	475	0.8588	0.7425
0.2302	20.0	500	0.8611	0.745
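
The table shows validation loss reaching its minimum early and then rising while accuracy levels off, which usually indicates overfitting in the later epochs. A minimal sketch for locating those epochs follows. The rows are copied from the table above, and the helper names `best_by_loss` and `first_best_accuracy` are hypothetical.

```python
# (epoch, validation_loss, accuracy), copied from the training results table
RESULTS = [
    (1, 0.6622, 0.5975), (2, 0.6039, 0.67), (3, 0.5630, 0.71),
    (4, 0.5624, 0.7075), (5, 0.5749, 0.7325), (6, 0.5818, 0.73),
    (7, 0.6017, 0.735), (8, 0.6407, 0.735), (9, 0.6702, 0.74),
    (10, 0.6999, 0.74), (11, 0.7251, 0.7325), (12, 0.7472, 0.735),
    (13, 0.7647, 0.745), (14, 0.7962, 0.74), (15, 0.8120, 0.7325),
    (16, 0.8283, 0.745), (17, 0.8366, 0.7425), (18, 0.8481, 0.745),
    (19, 0.8588, 0.7425), (20, 0.8611, 0.745),
]


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def first_best_accuracy(rows):
    """Return the earliest row that reaches the highest accuracy."""
    top = max(row[2] for row in rows)
    return next(row for row in rows if row[2] == top)


if __name__ == "__main__":
    print(best_by_loss(RESULTS))         # (4, 0.5624, 0.7075)
    print(first_best_accuracy(RESULTS))  # (13, 0.745 accuracy at loss 0.7647)
```

By this reading, the checkpoint with the lowest validation loss is epoch 4, while the reported final accuracy of 0.745 is first reached at epoch 13. Which checkpoint is preferable depends on whether loss or accuracy matters more for the intended use, which the card does not state.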

Framework versions

Transformers 4.38.1
Pytorch 2.1.0+cu121
Datasets 2.18.0
Tokenizers 0.15.2

Model repository: philip1231/distil_low_lr