llama-fin

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2086
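
If this loss is the usual token-level cross-entropy in nats (the Transformers default for causal language modeling), it corresponds to a perplexity of roughly exp(1.2086) ≈ 3.35.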

Model description

More information needed

Intended uses & limitations

More information needed
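
The card does not document usage. As a minimal sketch, assuming the repository id is Jae-star/llama-fin and that the checkpoint is a causal language model (as the "llama" name suggests), it could be loaded with the standard Transformers API:

```python
# Hedged loading/generation sketch. The repository id and the causal-LM assumption
# are inferred from the card, not confirmed by it.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Jae-star/llama-fin"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Example prompt", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```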

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments setup is sketched after the list):

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 2
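
Below is a minimal sketch of how these values map onto a Transformers TrainingArguments object; the output_dir and anything not listed above (logging, evaluation cadence, and so on) are placeholders rather than values from the original run.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-fin",           # placeholder, not from the card
    learning_rate=3e-4,
    per_device_train_batch_size=32,   # card lists train_batch_size: 32; per-device vs. total is an assumption
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=2,
)
```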

Training results

Training Loss | Epoch  | Step  | Validation Loss
------------- | ------ | ----- | ---------------
3.0634        | 0.1593 |  5000 | 1.6380
1.5345        | 0.3185 | 10000 | 1.4842
1.4255        | 0.4778 | 15000 | 1.4151
1.3929        | 0.6370 | 20000 | 1.3720
1.3462        | 0.7963 | 25000 | 1.3367
1.3094        | 0.9555 | 30000 | 1.3087
1.2835        | 1.1148 | 35000 | 1.2838
1.2534        | 1.2740 | 40000 | 1.2605
1.2303        | 1.4333 | 45000 | 1.2407
1.2187        | 1.5926 | 50000 | 1.2244
1.2001        | 1.7518 | 55000 | 1.2133
1.1937        | 1.9111 | 60000 | 1.2086

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.1.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1
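
As a quick environment check (a sketch, assuming these packages are installed), the versions can be compared against the card at runtime:

```python
# Print installed versions to compare against the card's framework versions.
import transformers, torch, datasets, tokenizers

print("transformers", transformers.__version__)  # card: 4.51.3
print("torch", torch.__version__)                # card: 2.1.0+cu118
print("datasets", datasets.__version__)          # card: 3.5.0
print("tokenizers", tokenizers.__version__)      # card: 0.21.1
```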

Safetensors

  • Model size: 27.9M params
  • Tensor type: F32
