# Discussion-Phi-4-multimodal-instruct-audio-dimp-only
This model is a fine-tuned version of [microsoft/Phi-4-multimodal-instruct](https://huggingface.co/microsoft/Phi-4-multimodal-instruct) on an unknown dataset. It achieves the following results on the evaluation set (final checkpoint; see the training results below):
- Loss: 77.3568
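
Below is a minimal inference sketch for loading this checkpoint with Hugging Face Transformers. It assumes the checkpoint keeps the base model's remote-code processor and its `<|user|>`/`<|audio_1|>`/`<|end|><|assistant|>` audio prompt format; the audio file name and instruction text are placeholders.

```python
import soundfile as sf
import torch
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

model_id = "TakalaWang/Discussion-Phi-4-multimodal-instruct-audio-dimp-only"

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    attn_implementation="eager",  # assumption: avoids a hard flash-attn dependency
).to("cuda")
generation_config = GenerationConfig.from_pretrained(model_id)

# Load a waveform and its sample rate (placeholder file name).
audio, sr = sf.read("discussion_clip.wav")
prompt = "<|user|><|audio_1|>Summarize the discussion in this clip.<|end|><|assistant|>"

inputs = processor(text=prompt, audios=[(audio, sr)], return_tensors="pt").to("cuda")
output_ids = model.generate(
    **inputs, max_new_tokens=256, generation_config=generation_config
)
# Decode only the newly generated tokens, skipping the prompt.
response = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(response)
```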
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: AdamW (torch) with betas=(0.9, 0.95) and epsilon=1e-07; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 50
- num_epochs: 3
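
For reference, here is a sketch of how these values map onto Hugging Face Transformers `TrainingArguments`; the `output_dir` is a placeholder, and the rest of the trainer setup (model wrapping, data collator) is not specified in this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi4-audio-dimp-only",  # placeholder output path
    learning_rate=4e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=16,  # effective train batch size: 1 x 16 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-7,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=3,
)
```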
### Training results
| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 537668.6875   | 0.2235 | 10   | 3644.7732       |
| 0.2122        | 0.4469 | 20   | 15.8183         |
| 0.1827        | 0.6704 | 30   | 21.3162         |
| 0.044         | 0.8939 | 40   | 17.6673         |
| 0.0932        | 1.1117 | 50   | 19.0555         |
| 0.0639        | 1.3352 | 60   | 16.6945         |
| 0.041         | 1.5587 | 70   | 20.2311         |
| 0.0795        | 1.7821 | 80   | 26.3871         |
| 0.0125        | 2.0    | 90   | 20.4623         |
| 0.0148        | 2.2235 | 100  | 21.1870         |
| 0.0128        | 2.4469 | 110  | 36.9605         |
| 0.0055        | 2.6704 | 120  | 63.9226         |
| 0.0037        | 2.8939 | 130  | 77.3568         |
### Framework versions

- Transformers 4.48.2
- PyTorch 2.4.1+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1