For usage instructions, follow openai/whisper-large-v3-turbo. Note that for faster-whisper, the vocabulary changes make `model.is_multilingual` and the default `suppress_tokens` incorrect; adjust the code as required if you want to use this model with faster-whisper.
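
A minimal transformers sketch, following the openai/whisper-large-v3-turbo usage this card points to (the audio file name and device are placeholders, not part of this card):

```python
import torch
from transformers import pipeline

# Load the finetune through the standard Whisper ASR pipeline,
# exactly as for openai/whisper-large-v3-turbo.
pipe = pipeline(
    "automatic-speech-recognition",
    model="efwkjn/whisper-ja-anime-v0.3",
    torch_dtype=torch.float16,   # drop or use float32 on CPU
    device="cuda:0",             # placeholder device
)

# "audio.wav" is a placeholder; return_timestamps enables long-form transcription.
result = pipe("audio.wav", return_timestamps=True)
print(result["text"])
```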

Turbo finetune with a Japanese tokenizer. Full finetune, trained for 2^19 steps at batch size 64. The smaller vocabulary packs ~1.6x more bytes per token, which allows faster inference with 4 decoder layers than a 2-layer distil model (decoder ~10% larger).
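
A quick way to confirm the reduced vocabulary and decoder depth from the shipped config (standard WhisperConfig attributes; the printed values are whatever the checkpoint contains):

```python
from transformers import WhisperConfig

# Inspect the config rather than assuming large-v3-turbo defaults:
# this finetune uses a smaller Japanese vocabulary and 4 decoder layers.
cfg = WhisperConfig.from_pretrained("efwkjn/whisper-ja-anime-v0.3")
print("decoder layers:", cfg.decoder_layers)
print("vocab size:", cfg.vocab_size)
```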

Benchmarks: short-form results are slightly behind v0.2 (possibly due to less training), but long-form results are much better. The model was also trained on lyrics, though this is untested.

Acknowledgements

  • Train sets: OOPPEENN, Reazon, 小虫哥_, Common Voice 20, deepghs
  • Test sets: KitsuneX07, TEDxJP, kotoba-tech, Saruwatari-lab, grider-withourai

Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC)
