train_2025-04-13-11-42-17

This model is a fine-tuned version of Qwen/Qwen2.5-Coder-3B on the hccsri dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0146
  • Num Input Tokens Seen: 304528

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • num_epochs: 1.0
  • mixed_precision_training: Native AMP
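
As a rough illustration of how the values above fit together, the sketch below restates them as keyword arguments named after `transformers.TrainingArguments` fields and checks the batch-size arithmetic. The single-device setting, the absence of warmup, and the helper `cosine_lr` are assumptions made for illustration; the card does not state them. The mixed-precision flag is left out because "Native AMP" does not say whether fp16 or bf16 was used.

```python
import math

# Values copied from the hyperparameter list; keys follow
# transformers.TrainingArguments field names.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 1.0,
}

NUM_DEVICES = 1  # assumption: the card does not list a device count

# Effective batch size = per-device batch * accumulation steps * devices.
total_train_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)


def cosine_lr(step, total_steps, peak_lr=5e-5, warmup_steps=0):
    """Hypothetical helper: linear warmup, then half-cosine decay to zero.

    warmup_steps=0 is an assumption; the card does not mention warmup.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(total_train_batch_size)
```

With one device the product is 2 × 2 × 1 = 4, which agrees with the listed total_train_batch_size. The dictionary could be passed as `TrainingArguments(output_dir=..., **training_kwargs)` together with whichever precision flag matches the original run.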

Training results

Training Loss  Epoch   Step  Validation Loss  Input Tokens Seen
3.3828         0.0421  100   3.3995           12336
3.7301         0.0842  200   3.3119           25168
2.7913         0.1263  300   3.2942           38256
2.9739         0.1684  400   3.2371           51344
2.4265         0.2105  500   3.2291           64896
3.2421         0.2526  600   3.1979           77584
3.1406         0.2947  700   3.1769           89952
3.1426         0.3368  800   3.1570           102976
3.1010         0.3789  900   3.1375           115536
3.3436         0.4211  1000  3.1259           129104
3.2059         0.4632  1100  3.0889           142176
3.2607         0.5053  1200  3.0811           155360
2.8431         0.5474  1300  3.0604           168080
3.3622         0.5895  1400  3.0450           181424
2.2921         0.6316  1500  3.0402           193696
3.1937         0.6737  1600  3.0319           206800
3.2635         0.7158  1700  3.0285           219920
2.9374         0.7579  1800  3.0246           232224
3.3592         0.8000  1900  3.0196           244976
3.1163         0.8421  2000  3.0173           256912
2.8533         0.8842  2100  3.0168           270112
3.6021         0.9263  2200  3.0151           282512
3.2839         0.9684  2300  3.0146           295680
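
A few derived quantities can make the table easier to interpret. The sketch below is an illustration, not part of the original training run. It assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the card. The example count is only a rough estimate that assumes one training example per batch slot.

```python
import math

# Values copied from the first and last rows of the results table.
first_eval_loss = 3.3995
final_eval_loss = 3.0146
final_step = 2300
final_epoch = 0.9684

# Assumption: loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(final_eval_loss)

# Optimizer steps in one full epoch, estimated from the last logged row.
steps_per_epoch = round(final_step / final_epoch)

# total_train_batch_size from the hyperparameter list.
total_train_batch_size = 4
approx_train_examples = steps_per_epoch * total_train_batch_size

# Relative improvement in validation loss over the logged range.
relative_drop = (first_eval_loss - final_eval_loss) / first_eval_loss

print(f"perplexity ~ {perplexity:.2f}")
print(f"steps per epoch ~ {steps_per_epoch}")
print(f"training examples ~ {approx_train_examples}")
print(f"validation loss drop ~ {relative_drop:.1%}")
```

Under these assumptions the final validation loss corresponds to a perplexity of about 20.4, and the validation loss falls by roughly 11% between step 100 and step 2300, with most of the improvement in the first half of the epoch.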

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.1
  • Pytorch 2.6.0+cu118
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model tree for hccsri/hccsriAI

  • Base model: Qwen/Qwen2.5-3B
  • Adapter (3): this model
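
Because the model is published as a PEFT adapter, loading it means applying the adapter weights on top of the base model. The following is a minimal sketch under stated assumptions, not a usage recipe confirmed by the author. `load_adapter` is a hypothetical helper, and the tokenizer id is assumed to be the base model named in the card's opening paragraph. `AutoPeftModelForCausalLM` reads the base model name from the adapter's own configuration, so the base model does not need to be specified by hand.

```python
ADAPTER_ID = "hccsri/hccsriAI"          # repository id from the model tree
TOKENIZER_ID = "Qwen/Qwen2.5-Coder-3B"  # assumption: tokenizer of the base
                                        # model named in the card's intro


def load_adapter(adapter_id=ADAPTER_ID, tokenizer_id=TOKENIZER_ID):
    """Hypothetical helper: return (model, tokenizer) with the adapter applied."""
    # Imported lazily so this module can be imported without the heavy
    # dependencies installed.
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    # Resolves the base model from the adapter config, downloads it,
    # and attaches the adapter weights.
    model = AutoPeftModelForCausalLM.from_pretrained(adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    return model, tokenizer
```

Calling `model, tokenizer = load_adapter()` downloads both the base weights and the adapter, so it needs network access and enough memory for a 3B-parameter model.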