---
library_name: transformers
license: llama3
base_model: mrcuddle/Dark-Hermes3-Llama3.2-3B
tags:
- axolotl
- generated_from_trainer
datasets:
- NousResearch/hermes-function-calling-v1
model-index:
- name: Dark-Hermes3-Llama3.2-3B-Func
  results: []
---

[Built with Axolotl](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.8.0.dev0`
```yaml
base_model: mrcuddle/Dark-Hermes3-Llama3.2-3B
hub_model_id: mrcuddle/Dark-Hermes3-Llama3.2-3B-Func
dataloader_num_workers: 8
datasets:
  - chat_template: alpaca
    field_messages: conversations
    message_property_mappings:
      content: value
      role: from
    path: NousResearch/hermes-function-calling-v1
    split: train
    type: chat_template
eval_steps: 500
evaluation_strategy: steps
fp16: true
gradient_accumulation_steps: 4
gradient_checkpointing: true
learning_rate: 2e-5
logging_dir: /content/outputs/logs
logging_steps: 50
lr_scheduler: linear
lr_scheduler_type: linear
micro_batch_size: 2
num_train_epochs: 3
optimizer: adamw_torch # Or another optimizer of your choice
output_dir: /content/outputs
overwrite_output_dir: true
per_device_train_batch_size: 8
save_steps: 500
save_total_limit: 2
use_peft: false
val_set_size: 0.05
warmup_steps: 100
unsloth: true # Enable Unsloth if supported by your training framework
```

</details>
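For reference, the `message_property_mappings` in the config above remap the dataset's ShareGPT-style `conversations` records into the `role`/`content` message format consumed by the chat template. A minimal illustration (the sample record below is hypothetical, not taken from the dataset):

```python
# Illustration of message_property_mappings: the role is read from each
# turn's "from" key and the content from its "value" key.
raw_example = {  # hypothetical record in the dataset's ShareGPT-style layout
    "conversations": [
        {"from": "human", "value": "What's the weather in Paris?"},
        {"from": "gpt", "value": "<tool_call>...</tool_call>"},
    ]
}
messages = [
    {"role": turn["from"], "content": turn["value"]}
    for turn in raw_example["conversations"]
]
```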

# Dark-Hermes3-Llama3.2-3B-Func

This model is a fine-tuned version of [mrcuddle/Dark-Hermes3-Llama3.2-3B](https://huggingface.co/mrcuddle/Dark-Hermes3-Llama3.2-3B) on the [NousResearch/hermes-function-calling-v1](https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1) dataset.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 1.0

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0889 | 1    | 0.3864          |

### Framework versions

- Transformers 4.49.0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
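
## Inference example

A minimal, untested sketch of loading the published checkpoint with the standard `transformers` API; the prompt and generation settings are illustrative, not tuned for this model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mrcuddle/Dark-Hermes3-Llama3.2-3B-Func"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a chat prompt with the model's own chat template.
messages = [{"role": "user", "content": "What's the weather in Paris?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```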