wav2vec2-base-960h-musiccaps

This model is a fine-tuned version of facebook/wav2vec2-base-960h. The training dataset is not recorded in this card. At the final evaluation step (step 495), it achieves the following results on the evaluation set:

  • Loss: 3.9625
  • Exact Accuracy: 0.0
  • Partial Accuracy: 0.6056

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
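The hyperparameters above, combined with the Epoch and Step columns of the training results, allow a rough estimate of the run's size. The following is a minimal sketch of that arithmetic, not part of the original training code. It assumes one optimizer step per batch on a single device, with no gradient accumulation and no dropped last batch; the card does not state any of this.

```python
# Values read from the hyperparameter list and the training results table.
train_batch_size = 32
num_epochs = 50
logged_step, logged_epoch = 9, 0.9  # first logged evaluation row

# Assumption: one optimizer step per batch, single device,
# no gradient accumulation (not confirmed by the card).
steps_per_epoch = round(logged_step / logged_epoch)
total_steps = steps_per_epoch * num_epochs

# If ceil(n / batch_size) == steps_per_epoch, then n lies in this range.
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

print(f"steps per epoch: {steps_per_epoch}")
print(f"total optimizer steps: {total_steps}")
print(f"training examples: between {min_examples} and {max_examples}")
```

Under these assumptions the training set is small (a few hundred examples), which is worth keeping in mind when reading the evaluation numbers.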

Training results

Training Loss | Epoch | Step | Validation Loss | Exact Accuracy | Partial Accuracy
15.7112 | 0.9 | 9 | 15.9578 | 0.06 | 0.7733
11.1514 | 1.8 | 18 | 10.0587 | 0.04 | 0.7667
6.6287 | 2.7 | 27 | 6.6632 | 0.0 | 0.5911
4.2256 | 3.6 | 36 | 1.9485 | 0.0 | 0.3878
5.6467 | 4.5 | 45 | 4.0850 | 0.0 | 0.4422
5.7963 | 5.4 | 54 | 3.0922 | 0.0 | 0.5133
4.2273 | 6.3 | 63 | 4.0553 | 0.0 | 0.5933
4.0305 | 7.2 | 72 | 3.7869 | 0.0 | 0.4600
3.1243 | 8.1 | 81 | 2.4867 | 0.21 | 0.8244
2.1803 | 9.0 | 90 | 1.9611 | 0.0 | 0.6822
4.1344 | 9.9 | 99 | 2.0692 | 0.0 | 0.4978
2.4153 | 10.8 | 108 | 2.4877 | 0.0 | 0.4756
2.558 | 11.7 | 117 | 3.8951 | 0.0 | 0.46
3.2152 | 12.6 | 126 | 2.9583 | 0.0 | 0.4689
2.7496 | 13.5 | 135 | 2.0802 | 0.0 | 0.6833
3.3934 | 14.4 | 144 | 4.6528 | 0.0 | 0.5344
3.7926 | 15.3 | 153 | 2.9359 | 0.0 | 0.7156
3.7156 | 16.2 | 162 | 1.8099 | 0.0 | 0.5856
3.0527 | 17.1 | 171 | 2.1170 | 0.0 | 0.5956
3.0344 | 18.0 | 180 | 2.8589 | 0.0 | 0.6544
3.3441 | 18.9 | 189 | 2.8075 | 0.0 | 0.6089
2.553 | 19.8 | 198 | 2.0358 | 0.03 | 0.7456
3.7821 | 20.7 | 207 | 3.4526 | 0.0 | 0.5911
3.4076 | 21.6 | 216 | 2.6764 | 0.0 | 0.5733
3.1877 | 22.5 | 225 | 1.6248 | 0.0 | 0.5678
3.4089 | 23.4 | 234 | 4.6811 | 0.03 | 0.7167
4.6228 | 24.3 | 243 | 3.3227 | 0.0 | 0.5044
3.5935 | 25.2 | 252 | 3.7636 | 0.0 | 0.5267
4.5658 | 26.1 | 261 | 5.2628 | 0.0 | 0.6489
4.9328 | 27.0 | 270 | 4.7943 | 0.0 | 0.6267
5.6449 | 27.9 | 279 | 4.3986 | 0.0 | 0.6589
4.6314 | 28.8 | 288 | 3.1873 | 0.02 | 0.74
5.4069 | 29.7 | 297 | 3.5047 | 0.06 | 0.7678
5.4739 | 30.6 | 306 | 8.9623 | 0.0 | 0.4611
4.9211 | 31.5 | 315 | 4.1060 | 0.0 | 0.5644
5.1984 | 32.4 | 324 | 4.0939 | 0.0 | 0.4478
4.3039 | 33.3 | 333 | 2.6682 | 0.0 | 0.5267
3.5481 | 34.2 | 342 | 1.9895 | 0.01 | 0.7156
2.1797 | 35.1 | 351 | 2.6432 | 0.0 | 0.6033
2.0572 | 36.0 | 360 | 2.0274 | 0.0 | 0.5733
2.4059 | 36.9 | 369 | 1.1969 | 0.11 | 0.7989
3.4561 | 37.8 | 378 | 4.1893 | 0.06 | 0.7944
2.7969 | 38.7 | 387 | 2.3889 | 0.0 | 0.6222
2.3617 | 39.6 | 396 | 2.9330 | 0.0 | 0.5333
3.0501 | 40.5 | 405 | 2.8672 | 0.0 | 0.6322
3.1292 | 41.4 | 414 | 4.0559 | 0.0 | 0.5278
2.95 | 42.3 | 423 | 2.4140 | 0.0 | 0.6989
2.8121 | 43.2 | 432 | 3.9192 | 0.0 | 0.4622
4.0374 | 44.1 | 441 | 4.3913 | 0.0 | 0.6322
3.4575 | 45.0 | 450 | 2.8873 | 0.0 | 0.5878
2.8763 | 45.9 | 459 | 6.7531 | 0.0 | 0.4956
5.0513 | 46.8 | 468 | 5.0384 | 0.04 | 0.7544
4.0507 | 47.7 | 477 | 3.3518 | 0.0 | 0.5600
2.479 | 48.6 | 486 | 4.3941 | 0.0 | 0.5356
3.752 | 49.5 | 495 | 3.9625 | 0.0 | 0.6056
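
The evaluation metrics fluctuate considerably from step to step, so the final checkpoint (step 495) is not the best one in the log: the lowest validation loss (1.1969) occurs at step 369 and the highest partial accuracy (0.8244) at step 81. As an illustration only, the sketch below parses a few rows copied from the table and picks the best step under each criterion. The `EvalRow` type and `parse_rows` helper are hypothetical names introduced here, and the excerpt is not the full log.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    exact_acc: float
    partial_acc: float


# Excerpt of the training results (five rows only, values copied as logged).
EXCERPT = """\
15.7112 0.9 9 15.9578 0.06 0.7733
3.1243 8.1 81 2.4867 0.21 0.8244
3.1877 22.5 225 1.6248 0.0 0.5678
2.4059 36.9 369 1.1969 0.11 0.7989
3.752 49.5 495 3.9625 0.0 0.6056
"""


def parse_rows(text: str) -> list[EvalRow]:
    """Parse table rows; accepts space- or pipe-separated columns."""
    rows = []
    for line in text.splitlines():
        fields = line.replace("|", " ").split()
        if len(fields) != 6:
            continue  # skip blank lines and the header row
        t, e, s, v, x, p = fields
        rows.append(EvalRow(float(t), float(e), int(s), float(v), float(x), float(p)))
    return rows


rows = parse_rows(EXCERPT)
best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_partial = max(rows, key=lambda r: r.partial_acc)

print(f"lowest validation loss: step {best_by_loss.step} ({best_by_loss.val_loss})")
print(f"highest partial accuracy: step {best_by_partial.step} ({best_by_partial.partial_acc})")
```

Pasting the full table into `EXCERPT` should work the same way. Whether validation loss or one of the accuracy metrics is the right selection criterion depends on the task, which the card does not describe.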

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1
