brevity_purpose-promt_e30

This model is a fine-tuned version of TheBloke/Mistral-7B-Instruct-v0.2-GPTQ on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.5203

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2
num_epochs: 30
mixed_precision_training: Native AMP

Training results

Training Loss	Epoch	Step	Validation Loss
3.7055	0.8	2	2.9958
2.2541	2.0	5	2.4763
2.8514	2.8	7	2.1730
1.6487	4.0	10	1.8340
2.1422	4.8	12	1.6479
1.2389	6.0	15	1.3906
1.5708	6.8	17	1.2323
0.8859	8.0	20	1.0252
1.1093	8.8	22	0.9102
0.6324	10.0	25	0.7863
0.8314	10.8	27	0.7299
0.5064	12.0	30	0.6678
0.6959	12.8	32	0.6350
0.4274	14.0	35	0.5991
0.5989	14.8	37	0.5832
0.3782	16.0	40	0.5616
0.542	16.8	42	0.5521
0.3476	18.0	45	0.5434
0.5033	18.8	47	0.5399
0.3251	20.0	50	0.5345
0.4739	20.8	52	0.5289
0.3091	22.0	55	0.5224
0.4552	22.8	57	0.5206
0.3005	24.0	60	0.5203

Framework versions

PEFT 0.10.0
Transformers 4.40.2
Pytorch 2.1.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1

Spruteus
/

brevity_purpose-promt_e30

brevity_purpose-promt_e30

Model description

Intended uses & limitations

Training and evaluation data

Training procedure

Training hyperparameters

Training results

Framework versions

Model tree for Spruteus/brevity_purpose-promt_e30

Evaluation results