---
language:
- ja
pipeline_tag: automatic-speech-recognition
---

# This is a faster-whisper conversion of [efwkjn/whisper-ja-anime-v0.3](https://huggingface.co/efwkjn/whisper-ja-anime-v0.3)

For usage instructions, follow [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo).

Note: because of vocabulary changes in this fine-tune, faster-whisper's `model.is_multilingual` flag and default `suppress_tokens` are wrong for this model. Adjust your code as required if you want to use it with faster-whisper.

This is a Turbo fine-tune with a Japanese tokenizer: a full fine-tune trained for 2^19 steps at batch size 64. The smaller vocabulary (~1.6x bytes per token) allows faster decoding with 4 decoder layers than a 2-layer distilled model (the decoder is ~10% larger). See [benchmarks](BENCH.md). Short-form accuracy is slightly behind v0.2 (possibly undertrained), but long-form is much better. The model was also trained on lyrics, though this is untested.

# Acknowledgements

* Train sets: OOPPEENN, Reazon, 小虫哥_, Common Voice 20, deepghs
* Test sets: KitsuneX07, TEDxJP, kotoba-tech, Saruwatari-lab, grider-withourai

Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC)