Built with Axolotl

c52121a4-7150-47e8-86dc-07d77b96952d

This model is a fine-tuned version of VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct; the training dataset is not specified in this card (listed as "None"). It achieves the following results on the evaluation set:

  • Loss: 1.2849
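
A minimal loading sketch, assuming this repository hosts a PEFT/LoRA adapter for the base model named above (the repo id is taken from this card; the prompt and generation settings are purely illustrative):

```python
# Minimal sketch: load the base model, attach this adapter, and generate.
# The adapter repo id and base model come from this card; everything else
# (dtype, device placement, generation settings) is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct"
adapter_id = "lesso06/c52121a4-7150-47e8-86dc-07d77b96952d"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, device_map="auto", torch_dtype="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Build a chat-formatted prompt and decode only the newly generated tokens.
messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```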

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.000206
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 60
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (bitsandbytes 8-bit, OptimizerNames.ADAMW_BNB) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 500
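
The run itself was driven by an Axolotl configuration, which is not reproduced here. As a rough illustration only, the hyperparameters above map onto 🤗 Transformers TrainingArguments roughly as follows; anything not in the list above (such as the output directory) is an assumption:

```python
# Hypothetical sketch mapping the listed hyperparameters onto TrainingArguments.
# This is not the actual Axolotl config used for training.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",              # assumed; not stated in the card
    learning_rate=0.000206,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,     # effective train batch size: 4 * 2 = 8
    seed=60,
    optim="adamw_bnb_8bit",            # OptimizerNames.ADAMW_BNB
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=500,
)
```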

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0002 | 1    | 1.7801          |
| 1.501         | 0.0088 | 50   | 1.5433          |
| 1.6061        | 0.0176 | 100  | 1.5132          |
| 1.6193        | 0.0264 | 150  | 1.4998          |
| 1.4177        | 0.0352 | 200  | 1.4986          |
| 1.4498        | 0.0440 | 250  | 1.4235          |
| 1.4575        | 0.0528 | 300  | 1.3637          |
| 1.2726        | 0.0616 | 350  | 1.3191          |
| 1.3316        | 0.0704 | 400  | 1.2940          |
| 1.3561        | 0.0792 | 450  | 1.2850          |
| 1.3936        | 0.0881 | 500  | 1.2849          |
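
If the reported losses are mean token-level cross-entropy in nats (standard for causal-LM training, though not stated explicitly here), the final validation loss corresponds to a perplexity of roughly exp(1.2849) ≈ 3.6:

```python
# Convert the final validation loss to perplexity, assuming the loss is the
# mean token-level cross-entropy in nats (assumption, not stated in the card).
import math

final_val_loss = 1.2849
perplexity = math.exp(final_val_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 3.61
```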

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • PyTorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1