Malaysian Finetune Whisper Large V3 Turbo

Whisper Large V3 Turbo finetuned on Malaysian context.

Improvements

  1. Distilled from Whisper Large V3 on Malaysian and Science context.
  2. Better translation for Malay, Manglish, Mandarin, Tamil and Science context.
  3. Word-level timestamps through a new task, selected with the newly introduced <|transcribeprecise|> token.
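
To make the new task token concrete, here is a minimal sketch of how a Whisper decoder prompt is usually assembled as `<|startoftranscript|>`, then a language token, then a task token. It assumes that `<|transcribeprecise|>` occupies the same task slot as the stock `<|transcribe|>` and `<|translate|>` tokens, which this card does not state explicitly. The helper name `build_decoder_prompt` and the `SUPPORTED_TASKS` tuple are hypothetical, written only for illustration.

```python
# Task names assumed to be usable in the task slot of the decoder prompt.
# "transcribe" and "translate" are stock Whisper tasks; "transcribeprecise"
# is the token introduced by this model (its placement here is an assumption).
SUPPORTED_TASKS = ("transcribe", "translate", "transcribeprecise")


def build_decoder_prompt(language: str, task: str = "transcribe",
                         timestamps: bool = True) -> str:
    """Build a Whisper-style decoder prompt string (hypothetical helper)."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    tokens = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        # Stock Whisper token that disables timestamp prediction.
        tokens.append("<|notimestamps|>")
    return "".join(tokens)


# "ms" is Whisper's language code for Malay.
prompt = build_decoder_prompt("ms", "transcribeprecise")
print(prompt)
```

The resulting string could then be tokenized with the model's own tokenizer and passed as the decoder prompt. Check the tokenizer's vocabulary to confirm the token is present before relying on this layout.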

How we finetuned it

We trained in 2 phases:

  1. Finetuning on mesolitica/Malaysian-STT-Whisper.
  2. Annealing on 5% of mesolitica/Malaysian-STT-Whisper plus 100% of mesolitica/Malaysian-STT-Whisper-Stage2.
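
The annealing mix in the second phase can be sketched as follows. This is an illustration only: the card does not say how the 5% subset was drawn, so uniform random sampling with a fixed seed is an assumption here, and `build_annealing_mix` and the placeholder sample names are hypothetical stand-ins for the real dataset rows.

```python
import random


def build_annealing_mix(stage1, stage2, stage1_fraction=0.05, seed=0):
    """Combine a random fraction of stage-1 samples with all stage-2 samples.

    Hypothetical helper: the sampling strategy and seed are assumptions,
    not the authors' documented procedure.
    """
    rng = random.Random(seed)
    k = round(len(stage1) * stage1_fraction)
    subset = rng.sample(list(stage1), k)   # 5% of stage 1, without replacement
    mix = subset + list(stage2)            # 100% of stage 2
    rng.shuffle(mix)                       # interleave the two sources
    return mix


# Placeholder identifiers standing in for real dataset rows.
stage1 = [f"stage1-{i}" for i in range(1000)]
stage2 = [f"stage2-{i}" for i in range(200)]

mix = build_annealing_mix(stage1, stage2)
print(len(mix))
```

With 1,000 stage-1 items and 200 stage-2 items, the mix holds 50 + 200 samples. The same idea applies to streaming datasets, where the fraction would be applied per shard rather than to an in-memory list.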

Model size: 809M params, tensor type BF16 (Safetensors)

Model: mesolitica/malaysian-whisper-large-v3-turbo-v3