junsashihara/Qwen2.5-7B-Instruct-preference

Qwen2.5-7B-Instruct-preference

Model Description

Qwen2.5-7B-Instruct-preference is a fine-tuned model based on Qwen/Qwen2.5-7B-Instruct. This model is fine-tuned on original dataset. The fine-tuned were carried out at a 1024 context length.

Benchmarking

The benchmark score is obtained using arena-hard-auto-multilingual.

Qwen2.5-7B-Instruct	Ours
50.0	56.6

Model Details

Model size: 7B
Context length: 1024
Language: Japanese

Training Procudure

learning_rate: 5e-6
train_batch_size: 4
eval_batch_size: 2
gradient_accumulation_steps: 4
lr_scheduler_type: cosine

Training Results

Step	Traning Loss	Validation Loss
10	0.678400	0.665870
20	0.608500	0.638361
30	0.577300	0.607468
40	0.526700	0.559432
50	0.489200	0.523419
60	0.502800	0.511645
70	0.462300	0.506989
80	0.419600	0.509142
90	0.445200	0.510396
100	0.424400	0.511653

junsashihara
/

Qwen2.5-7B-Instruct-preference

Qwen2.5-7B-Instruct-preference

Model Description

Benchmarking

Model Details

Training Procudure

Training Results

Model tree for junsashihara/Qwen2.5-7B-Instruct-preference

Collection including junsashihara/Qwen2.5-7B-Instruct-preference

Qwen