mistral-7b-detox

This model is a fine-tuned version of mistralai/Mistral-7B-v0.3 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6476
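
The snippet below is a minimal usage sketch, assuming the checkpoint is a standard causal LM loadable with the transformers AutoModel API under the repository name Kamyar-zeinalipour/mistral-7b-detox; the prompt and generation settings are illustrative, not taken from this card.

```python
# Minimal usage sketch (hypothetical prompt; bfloat16 and device_map="auto"
# are assumptions -- adjust for your hardware; device_map requires accelerate).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Kamyar-zeinalipour/mistral-7b-detox"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Rewrite the following sentence in a polite, non-toxic way: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```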

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 48
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 30
  • training_steps: 700
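
As a reproduction aid, here is a hedged TrainingArguments sketch mapping the values above onto the Hugging Face Trainer API. The dataset, model wrapping, and Trainer call are omitted because the card does not specify them; the output directory, bf16 flag, and evaluation cadence are assumptions.

```python
# Hedged sketch: TrainingArguments mirroring the hyperparameters listed above.
# The dataset and any model wrapping are unknown ("More information needed"),
# so only the optimization settings are reconstructed here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-detox",   # assumption
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=3,
    seed=42,
    optim="adamw_torch",             # Adam defaults: betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_steps=30,
    max_steps=700,
    bf16=True,                       # assumption, matching the BF16 tensor type
    logging_steps=50,                # assumption, matching the results table
    eval_strategy="steps",
    eval_steps=50,
)
```

With 4 devices, a per-device batch size of 4, and 3 accumulation steps, the effective train batch size is 4 × 4 × 3 = 48 and the eval batch size is 4 × 4 = 16, matching the totals listed above; the multi-GPU setup itself would come from the launcher (e.g. torchrun or accelerate), not from these arguments.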

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.9783        | 0.0714 | 50   | 0.8158          |
| 0.3603        | 0.1429 | 100  | 0.5411          |
| 0.3077        | 0.2143 | 150  | 0.4969          |
| 0.2214        | 0.2857 | 200  | 0.5404          |
| 0.1777        | 0.3571 | 250  | 0.5685          |
| 0.1138        | 0.4286 | 300  | 0.5604          |
| 0.1029        | 0.5000 | 350  | 0.6015          |
| 0.0766        | 0.5714 | 400  | 0.5587          |
| 0.0631        | 0.6429 | 450  | 0.5737          |
| 0.0362        | 0.7143 | 500  | 0.5656          |
| 0.0374        | 0.7857 | 550  | 0.6343          |
| 0.0330        | 0.8571 | 600  | 0.6397          |
| 0.0338        | 0.9286 | 650  | 0.6391          |
| 0.0323        | 1.0000 | 700  | 0.6476          |

Framework versions

  • Transformers 4.44.0.dev0
  • Pytorch 2.1.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1