SebastianSchramm/tinyllama-1.1B-intermediate-step-715k-1.5T-dpo-lora-v4

Tags: Text Generation, Generated from Trainer, text-generation-inference

tinyllama-1.1B-intermediate-step-715k-1.5T-dpo-lora-v4

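This card was generated automatically by the Trainer and does not record a base model or a training dataset (the dataset field is None). The repository name suggests a DPO fine-tune with LoRA of a TinyLlama 1.1B intermediate checkpoint (step 715k, 1.5T tokens), although the card does not state this explicitly. The model achieves the following results on the evaluation set: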

Loss: 0.6904
Rewards/chosen: -3.5271
Rewards/rejected: -5.6475
Rewards/accuracies: 0.7393
Rewards/margins: 2.1205
Logps/rejected: -394.1334
Logps/chosen: -478.6117
Logits/rejected: -3.8937
Logits/chosen: -4.0184
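
The metric names above match the convention used by DPO trainers such as TRL's `DPOTrainer`; this is an assumption, since the card does not name the training library. Under that convention, the implicit reward is β times the difference between the policy and reference log-probabilities, the margin is the chosen reward minus the rejected reward, and the accuracy is the fraction of pairs whose chosen reward is higher. The sketch below illustrates these definitions. The function name `dpo_metrics`, the log-probabilities in the usage lines, and β = 0.1 are all hypothetical, not values from this training run.

```python
import math


def dpo_metrics(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Compute batch-averaged DPO metrics from per-sequence log-probabilities.

    All four inputs are equal-length lists of floats. beta is the DPO
    temperature; 0.1 is a hypothetical default, not taken from the card.
    """
    # Implicit reward: beta * (log pi_policy - log pi_reference)
    chosen_rewards = [beta * (p - r) for p, r in zip(policy_chosen, ref_chosen)]
    rejected_rewards = [beta * (p - r) for p, r in zip(policy_rejected, ref_rejected)]
    margins = [c - r for c, r in zip(chosen_rewards, rejected_rewards)]

    # Sigmoid DPO loss, -log(sigmoid(m)), written in a numerically stable form
    losses = [max(-m, 0.0) + math.log1p(math.exp(-abs(m))) for m in margins]

    n = len(margins)
    return {
        "loss": sum(losses) / n,
        "rewards/chosen": sum(chosen_rewards) / n,
        "rewards/rejected": sum(rejected_rewards) / n,
        "rewards/margins": sum(margins) / n,
        "rewards/accuracies": sum(m > 0 for m in margins) / n,
    }


# Hypothetical two-pair batch, only to show the shape of the output
example = dpo_metrics([-10.0, -20.0], [-15.0, -18.0], [-12.0, -19.0], [-13.0, -19.0])

# Consistency check on the card's own numbers: chosen minus rejected
card_margin = -3.5271 - (-5.6475)  # about 2.1204; the card reports 2.1205
```

The small difference between `card_margin` and the reported 2.1205 is consistent with the published values being rounded. Note also that the loss is averaged per pair, so it need not equal the loss evaluated at the mean margin.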

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 2
eval_batch_size: 2
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 32
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.02
num_epochs: 3
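
As a rough illustration of how these settings combine, the sketch below computes the effective batch size and a linear learning-rate schedule with warmup. It assumes the total batch size is the product of per-device batch size, accumulation steps, and device count, and that the schedule warms up linearly and then decays linearly to zero. The function names are hypothetical, and `total_steps = 2640` is only an estimate from the results table (roughly 880 steps per epoch over 3 epochs), not a value stated in the card.

```python
import math


def effective_batch_size(per_device, grad_accum, num_devices):
    """Examples consumed per optimizer step under the product assumption."""
    return per_device * grad_accum * num_devices


def linear_lr(step, total_steps, peak_lr=1e-3, warmup_ratio=0.02):
    """Learning rate at `step`: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return peak_lr * remaining


# 2 * 32 already equals the listed total of 64, so the device factor is 1 here
total = effective_batch_size(per_device=2, grad_accum=32, num_devices=1)

# Estimated total step count (assumption, see lead-in)
total_steps = 2640
schedule = [(s, linear_lr(s, total_steps)) for s in (0, 53, 1320, 2640)]
```

Under these assumptions the rate reaches its 0.001 peak at step 53 and returns to zero at the final step. The card lists distributed_type as multi-GPU but does not state how many GPUs were used.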

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/chosen	Rewards/rejected	Rewards/accuracies	Rewards/margins	Logps/rejected	Logps/chosen	Logits/rejected	Logits/chosen
0.5491	0.34	300	0.5719	-0.5176	-1.3357	0.7015	0.8181	-351.0149	-448.5167	-4.0592	-4.2257
0.5906	0.68	600	0.5625	-0.3365	-1.2779	0.7191	0.9414	-350.4370	-446.7061	-4.0731	-4.2239
0.2857	1.02	900	0.5723	-0.3882	-1.5979	0.7141	1.2097	-353.6368	-447.2226	-4.0753	-4.2332
0.2679	1.36	1200	0.5883	-1.1630	-2.3423	0.7234	1.1793	-361.0811	-454.9714	-4.0115	-4.1888
0.231	1.71	1500	0.5895	-1.3278	-2.7966	0.7338	1.4688	-365.6242	-456.6194	-4.0069	-4.1696
0.0862	2.05	1800	0.6626	-2.7764	-4.6708	0.7284	1.8944	-384.3661	-471.1047	-3.9624	-4.0992
0.0804	2.39	2100	0.6818	-3.0330	-5.1156	0.7410	2.0826	-388.8140	-473.6706	-3.9128	-4.0467
0.0925	2.73	2400	0.6947	-3.5621	-5.6537	0.7371	2.0916	-394.1956	-478.9623	-3.8908	-4.0137
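
One way to read the table is to compare checkpoints directly. The sketch below copies four columns from the table and finds the evaluation step with the lowest validation loss and the step with the highest reward accuracy. The variable names are illustrative.

```python
# (step, training loss, validation loss, rewards/accuracies), copied from the table
rows = [
    (300, 0.5491, 0.5719, 0.7015),
    (600, 0.5906, 0.5625, 0.7191),
    (900, 0.2857, 0.5723, 0.7141),
    (1200, 0.2679, 0.5883, 0.7234),
    (1500, 0.2310, 0.5895, 0.7338),
    (1800, 0.0862, 0.6626, 0.7284),
    (2100, 0.0804, 0.6818, 0.7410),
    (2400, 0.0925, 0.6947, 0.7371),
]

best_by_loss = min(rows, key=lambda r: r[2])      # lowest validation loss
best_by_accuracy = max(rows, key=lambda r: r[3])  # highest reward accuracy

# Gap between validation and training loss at each evaluation step
gaps = [(step, round(val - train, 4)) for step, train, val, _ in rows]

print("lowest validation loss at step", best_by_loss[0])         # step 600
print("highest reward accuracy at step", best_by_accuracy[0])    # step 2100
```

Validation loss is lowest at step 600 and rises through the later epochs while training loss keeps falling, which may indicate overfitting to the training pairs. Reward accuracy and margins continue to improve until about step 2100, so which checkpoint is preferable depends on which metric matters for the intended use.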

Framework versions

Transformers 4.35.0
Pytorch 2.1.0+cu121
Datasets 2.14.6
Tokenizers 0.14.1
