Built with Axolotl

16e9f1fa-0db2-4723-a67b-8232bb72b388

This model is a fine-tuned version of VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct (the training dataset is not specified in this card). It achieves the following results on the evaluation set:

  • Loss: 1.2820
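
This checkpoint is a PEFT adapter rather than a full set of model weights. A minimal loading and generation sketch, assuming standard Transformers/PEFT usage (the prompt and generation settings below are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct"
adapter_id = "lesso11/16e9f1fa-0db2-4723-a67b-8232bb72b388"  # repository id taken from this card

# Load the base model, then attach the PEFT adapter weights on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "Hello, how are you?"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```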

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 0.000211
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 110
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (8-bit, bitsandbytes) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 500
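
As a hedged sketch, the list above corresponds roughly to the following Hugging Face Trainer configuration. The argument names are an assumption: the run was most likely driven by an Axolotl config rather than raw TrainingArguments.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="16e9f1fa-0db2-4723-a67b-8232bb72b388",  # illustrative output directory
    learning_rate=0.000211,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=110,
    gradient_accumulation_steps=2,   # effective train batch size: 4 * 2 = 8
    optim="adamw_bnb_8bit",          # 8-bit AdamW from bitsandbytes
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=500,
)
```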

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0002 | 1    | 1.7827          |
| 1.4843        | 0.0088 | 50   | 1.5400          |
| 1.5669        | 0.0176 | 100  | 1.5711          |
| 1.4269        | 0.0264 | 150  | 1.5191          |
| 1.5171        | 0.0352 | 200  | 1.4625          |
| 1.4868        | 0.0440 | 250  | 1.4442          |
| 1.4091        | 0.0528 | 300  | 1.3749          |
| 1.4567        | 0.0616 | 350  | 1.3321          |
| 1.3792        | 0.0704 | 400  | 1.2947          |
| 1.3412        | 0.0792 | 450  | 1.2853          |
| 1.2482        | 0.0881 | 500  | 1.2820          |
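
For reference, the final validation loss of 1.2820 corresponds to a perplexity of roughly exp(1.2820) ≈ 3.6, assuming the reported loss is the mean per-token cross-entropy.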

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1