zephyr-7b-ultra-p-0.07

This model is a fine-tuned version of alignment-handbook/zephyr-7b-sft-full on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4875
  • Rewards/chosen: -0.5244
  • Rewards/rejected: -1.7320
  • Rewards/accuracies: 0.7109
  • Rewards/margins: 1.2075
  • Logps/rejected: -264.7986
  • Logps/chosen: -235.2518
  • Logits/rejected: -2.5806
  • Logits/chosen: -2.6463
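The card ships without a usage example; the following is a minimal inference sketch, assuming the checkpoint is hosted at tongliuphysics/zephyr-7b-ultra-p-0.07 (the repo id shown on this page), inherits the Zephyr chat template from the SFT base, and stores BF16 weights as listed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tongliuphysics/zephyr-7b-ultra-p-0.07"  # repo id from this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

# Zephyr models are chat-tuned; format the prompt with the chat template.
messages = [{"role": "user", "content": "Explain preference tuning in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```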

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-07
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1.0
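As a rough, hedged sketch (not the authors' training script), the settings above map onto transformers TrainingArguments as follows; the effective train batch size of 64 comes from 1 sample per device × 8 GPUs × 8 accumulation steps. The output_dir is hypothetical, and bf16 is inferred from the BF16 checkpoint rather than stated on the card:

```python
from transformers import TrainingArguments

# Illustrative mapping of the listed hyperparameters.
training_args = TrainingArguments(
    output_dir="zephyr-7b-ultra-p-0.07",  # hypothetical
    learning_rate=5e-7,
    per_device_train_batch_size=1,  # x 8 GPUs x 8 accumulation steps = 64
    per_device_eval_batch_size=8,   # x 8 GPUs = 64
    gradient_accumulation_steps=8,
    num_train_epochs=1.0,
    lr_scheduler_type="linear",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,  # assumption, based on the BF16 checkpoint
)
```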

Training results

| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.546         | 0.1030 | 100  | 0.5298          | -0.2496        | -1.0248          | 0.6875             | 0.7752          | -257.7271      | -232.5039    | -2.5880         | -2.6521       |
| 0.5221        | 0.2060 | 200  | 0.5055          | -0.4087        | -1.6295          | 0.6953             | 1.2208          | -263.7740      | -234.0950    | -2.5117         | -2.5716       |
| 0.5224        | 0.3090 | 300  | 0.4941          | -0.4567        | -1.5733          | 0.7031             | 1.1166          | -263.2119      | -234.5742    | -2.5233         | -2.5891       |
| 0.4907        | 0.4120 | 400  | 0.4953          | -0.5910        | -1.7671          | 0.7500             | 1.1761          | -265.1498      | -235.9173    | -2.5826         | -2.6460       |
| 0.4876        | 0.5150 | 500  | 0.4960          | -0.3941        | -1.5280          | 0.6953             | 1.1339          | -262.7591      | -233.9482    | -2.5836         | -2.6465       |
| 0.4822        | 0.6180 | 600  | 0.4927          | -0.3062        | -1.5501          | 0.7109             | 1.2439          | -262.9799      | -233.0692    | -2.5521         | -2.6147       |
| 0.5079        | 0.7210 | 700  | 0.4948          | -0.4099        | -1.5277          | 0.7109             | 1.1178          | -262.7565      | -234.1069    | -2.5957         | -2.6577       |
| 0.5052        | 0.8240 | 800  | 0.4936          | -0.4967        | -1.7099          | 0.6953             | 1.2132          | -264.5780      | -234.9745    | -2.5953         | -2.6602       |
| 0.4691        | 0.9270 | 900  | 0.4879          | -0.4878        | -1.6605          | 0.7188             | 1.1727          | -264.0843      | -234.8857    | -2.5774         | -2.6430       |
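The reward columns follow the convention of DPO-style preference training (an assumption; the card does not name the objective): a response's implicit reward is the β-scaled log-probability ratio between the policy and the SFT reference model, and Rewards/margins is simply Rewards/chosen minus Rewards/rejected, e.g. -0.4878 - (-1.6605) = 1.1727 at step 900.

$$
r(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\text{ref}}(y \mid x)},
\qquad
\mathcal{L}_{\text{DPO}} = -\log \sigma\bigl(r(x, y_w) - r(x, y_l)\bigr)
$$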

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.0
  • Tokenizers 0.20.0