How to?
#1 by etemiz - opened
Hi,
Thanks for the model. How do I use it?
whisper.load_model() does not work.
Hi,
whisper.load_model() only works with OpenAI's original models. For custom models like mine, use either:
Transformers (easiest):
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="ysdede/whisper-khanacademy-large-v3-turbo-tr")
pipe("audio.mp3")
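For longer recordings you may want chunking and timestamps. Here is a minimal sketch using the same Transformers pipeline; the 30-second chunk length and the audio.mp3 file name are just placeholders:

```python
from transformers import pipeline

# Same model as above; chunk_length_s splits long audio into 30-second windows.
pipe = pipeline(
    "automatic-speech-recognition",
    model="ysdede/whisper-khanacademy-large-v3-turbo-tr",
    chunk_length_s=30,
)

# return_timestamps=True adds per-segment start/end times to the output.
result = pipe("audio.mp3", return_timestamps=True)
print(result["text"])
for chunk in result["chunks"]:
    print(chunk["timestamp"], chunk["text"])
```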
Faster Inference (CT2 backend):
I recommend the optimized CT2 version for better speed: ysdede/whisper-khanacademy-large-v3-turbo-tr-ct2
It works with faster-whisper or whisperx.
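If you prefer the faster-whisper Python API over the CLI, a minimal sketch could look like this; the device, compute type, and audio.mp3 file name are assumptions, and faster-whisper pulls the CT2 model from the Hub by its repo ID:

```python
from faster_whisper import WhisperModel

# The CT2 repo is downloaded automatically; device/compute_type are assumptions.
model = WhisperModel(
    "ysdede/whisper-khanacademy-large-v3-turbo-tr-ct2",
    device="cpu",          # use "cuda" if a GPU is available
    compute_type="int8",   # same quantization as the batch file below
)

# transcribe() returns a generator of segments plus transcription info.
segments, info = model.transcribe("audio.mp3", language="tr")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```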
For whisperx, simply:
whisperx audio.mp3 --model ysdede/whisper-khanacademy-large-v3-turbo-tr-ct2
or, using a batch file:
@echo off
set "input_file=%~1"
set "output_file=%~dpn1.vtt"
set "output_dir=%~dp1"
if "%output_dir:~-1%"=="\" set "output_dir=%output_dir:~0,-1%"
echo Input file: %input_file%
echo Output file: %output_file%
echo Output directory: %output_dir%
whisperx.exe "%input_file%" --language tr --output_format vtt --compute_type int8 --model "ysdede/whisper-khanacademy-large-v3-turbo-tr-ct2" --segment_resolution sentence --verbose True --batch_size 1 --print_progress True --output_dir "%output_dir%"
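If you save the script as, say, transcribe_tr.bat (the file name is just an example), you can drag an audio file onto it in Explorer or call it from a terminal; the .vtt file is written next to the input:

```
transcribe_tr.bat "C:\path\to\lecture.mp3"
```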
Important Note:
This is not superior to vanilla large-v3 for general Turkish - it's specifically fine-tuned to recognize Khan Academy's teaching style (educational terms, lecture cadence). For everyday speech, OpenAI's original may perform better.