# zephyr-7b-ultra-p-0.04

This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.5016
- Rewards/chosen: -0.5725
- Rewards/rejected: -2.2075
- Rewards/accuracies: 0.7188
- Rewards/margins: 1.6350
- Logps/rejected: -269.5538
- Logps/chosen: -235.7328
- Logits/rejected: -2.5465
- Logits/chosen: -2.6177
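The rewards/* and logps/* fields above are the evaluation metrics logged by TRL's `DPOTrainer`, which suggests (though the card does not state it) that this checkpoint was preference-tuned with DPO. At inference time it behaves as a standard causal LM with the chat template inherited from the Zephyr SFT base. Below is a minimal usage sketch with `transformers`; the prompt and generation settings are illustrative, and `device_map="auto"` assumes `accelerate` is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tongliuphysics/zephyr-7b-ultra-p-0.04"  # repo id from this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the checkpoint is stored in BF16
    device_map="auto",           # requires `accelerate`
)

# Zephyr-style models are chat models; use the tokenizer's chat template.
messages = [{"role": "user", "content": "Summarize direct preference optimization in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```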

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-07
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1.0
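These values map one-to-one onto `transformers.TrainingArguments` (the effective train batch size is 1 per device × 8 GPUs × 8 accumulation steps = 64, as listed). Since the logged metrics match TRL's `DPOTrainer`, a plausible reconstruction of the setup is sketched below; the dataset, output directory, and DPO `beta` are assumptions, as the card does not specify them:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "alignment-handbook/zephyr-7b-sft-full"  # base model named on this card
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Stand-in preference dataset; the actual training data is not stated on the card.
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized")

args = DPOConfig(
    output_dir="zephyr-7b-ultra-p-0.04",  # placeholder
    learning_rate=5e-7,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,
    num_train_epochs=1.0,
    lr_scheduler_type="linear",
    seed=42,
    beta=0.1,   # TRL default; not stated on the card
    bf16=True,  # assumption, consistent with the BF16 checkpoint
    # Adam betas=(0.9, 0.999) and eps=1e-8 are already the TrainingArguments defaults.
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train_prefs"],
    eval_dataset=dataset["test_prefs"],
    tokenizer=tokenizer,  # `processing_class=` in newer TRL versions
)
trainer.train()
```

Note that the 8-way data parallelism comes from the launcher (e.g. `accelerate launch` or `torchrun`), not from these arguments.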

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.569         | 0.1030 | 100  | 0.5464          | -0.2783        | -0.9172          | 0.6953             | 0.6389          | -256.6510      | -232.7902    | -2.5836         | -2.6494       |
| 0.5376        | 0.2060 | 200  | 0.5246          | -0.5656        | -1.4359          | 0.7031             | 0.8703          | -261.8382      | -235.6634    | -2.4669         | -2.5324       |
| 0.53          | 0.3090 | 300  | 0.5364          | -0.6183        | -1.3601          | 0.6875             | 0.7418          | -261.0804      | -236.1908    | -2.4854         | -2.5577       |
| 0.5092        | 0.4120 | 400  | 0.5188          | -0.6586        | -2.1894          | 0.7266             | 1.5308          | -269.3731      | -236.5937    | -2.5850         | -2.6537       |
| 0.5039        | 0.5150 | 500  | 0.5133          | -0.4455        | -1.8944          | 0.7109             | 1.4490          | -266.4232      | -234.4622    | -2.5959         | -2.6648       |
| 0.5018        | 0.6180 | 600  | 0.5124          | -0.3893        | -1.9586          | 0.7266             | 1.5693          | -267.0651      | -233.9008    | -2.5504         | -2.6196       |
| 0.5162        | 0.7210 | 700  | 0.5112          | -0.4435        | -1.9493          | 0.7188             | 1.5058          | -266.9722      | -234.4430    | -2.5634         | -2.6316       |
| 0.5264        | 0.8240 | 800  | 0.5078          | -0.5335        | -2.2073          | 0.7344             | 1.6738          | -269.5521      | -235.3425    | -2.5636         | -2.6340       |
| 0.4775        | 0.9270 | 900  | 0.5023          | -0.5305        | -2.1520          | 0.7266             | 1.6216          | -268.9995      | -235.3122    | -2.5423         | -2.6135       |
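A quick consistency check on the table: rewards/margins is always rewards/chosen − rewards/rejected (up to rounding in the last printed digit), e.g. at step 100, −0.2783 − (−0.9172) = 0.6389. The same identity can be verified over every logged row:

```python
# (rewards/chosen, rewards/rejected, rewards/margins) per logged step, from the table above
rows = [
    (-0.2783, -0.9172, 0.6389), (-0.5656, -1.4359, 0.8703), (-0.6183, -1.3601, 0.7418),
    (-0.6586, -2.1894, 1.5308), (-0.4455, -1.8944, 1.4490), (-0.3893, -1.9586, 1.5693),
    (-0.4435, -1.9493, 1.5058), (-0.5335, -2.2073, 1.6738), (-0.5305, -2.1520, 1.6216),
]
# 5e-4 tolerance absorbs rounding in the reported values
assert all(abs((chosen - rejected) - margin) < 5e-4 for chosen, rejected, margin in rows)
```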

### Framework versions

- Transformers 4.45.1
- Pytorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.20.0