swe_30k_v2_tag5

This model is a fine-tuned version of Qwen/Qwen2.5-Coder-7B-Instruct on the swe_30k_v2_tag5 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4522
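
Since the usage sections of this card are still placeholders, here is a minimal inference sketch using the standard transformers API. The repository id is lemonhat/Qwen2.5-Coder-7B-Instruct-swe_30k_v2_tag5; the BF16 dtype, device placement, and generation settings are assumptions, and the chat template is assumed to be inherited from the Qwen2.5-Coder-7B-Instruct base model.

```python
# Minimal inference sketch (assumed usage; not documented by this card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lemonhat/Qwen2.5-Coder-7B-Instruct-swe_30k_v2_tag5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: checkpoint is stored in BF16
    device_map="auto",
)

# Assumes the Qwen2.5 chat template inherited from the base model.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```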

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a rough TrainingArguments equivalent is sketched after the list):

  • learning_rate: 1e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 4
  • total_eval_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • num_epochs: 1
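
As a reading aid, the hyperparameters above map roughly onto a Hugging Face TrainingArguments configuration as sketched below. This is a reconstruction, not the original training script: the output directory, gradient accumulation of 1, eval cadence, and bf16 flag are assumptions. The total train batch size of 4 follows from 1 example per device across 4 GPUs.

```python
# Rough reconstruction of the training configuration as TrainingArguments
# (a sketch under stated assumptions, not the original script).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Qwen2.5-Coder-7B-Instruct-swe_30k_v2_tag5",  # assumed name
    learning_rate=1e-5,
    per_device_train_batch_size=1,   # 1 per device x 4 GPUs = total batch 4
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=1,   # assumption: not stated on the card
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",
    eval_steps=100,                  # matches the 100-step cadence in the results table
    bf16=True,                       # assumption: BF16 weights suggest BF16 training
)
```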

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.6812        | 0.0524 | 100  | 0.5303          |
| 0.6640        | 0.1049 | 200  | 0.5222          |
| 0.6851        | 0.1573 | 300  | 0.5144          |
| 0.6637        | 0.2098 | 400  | 0.5085          |
| 0.5823        | 0.2622 | 500  | 0.4992          |
| 0.6342        | 0.3146 | 600  | 0.4874          |
| 0.5819        | 0.3671 | 700  | 0.4845          |
| 0.5393        | 0.4195 | 800  | 0.4796          |
| 0.7043        | 0.4719 | 900  | 0.4728          |
| 0.4485        | 0.5244 | 1000 | 0.4708          |
| 0.6060        | 0.5768 | 1100 | 0.4642          |
| 0.5210        | 0.6293 | 1200 | 0.4612          |
| 0.5420        | 0.6817 | 1300 | 0.4597          |
| 0.5452        | 0.7341 | 1400 | 0.4562          |
| 0.5425        | 0.7866 | 1500 | 0.4558          |
| 0.5805        | 0.8390 | 1600 | 0.4525          |
| 0.5275        | 0.8915 | 1700 | 0.4524          |
| 0.5267        | 0.9439 | 1800 | 0.4526          |
| 0.5343        | 0.9963 | 1900 | 0.4521          |
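
For intuition, if the validation loss is the usual mean token-level cross-entropy, the final value corresponds to a perplexity of exp(0.4521) ≈ 1.57:

```python
# Perplexity from the final validation loss, assuming it is the
# standard mean token-level cross-entropy.
import math

final_val_loss = 0.4521
print(math.exp(final_val_loss))  # ~1.57
```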

Framework versions

  • Transformers 4.46.1
  • Pytorch 2.6.0+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3