# faster-whisper-base-ar-quran
This model is a CTranslate2 version of tarteel-ai/whisper-base-ar-quran. The conversion was performed using the following command:
```bash
# Don't add vocab.json and config.json, as they are created automatically during conversion
ct2-transformers-converter --model tarteel-ai/whisper-base-ar-quran --force --output_dir "path/to/output/dir/faster-whisper-base-ar-quran" --quantization float16 --copy_files added_tokens.json normalizer.json preprocessor_config.json special_tokens_map.json tokenizer_config.json
```
To use the `ct2-transformers-converter` command, you'll need to install the required dependencies:

```bash
pip install transformers[torch]
```
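If the converter command itself is missing, note that it ships with the `ctranslate2` Python package (a detail from CTranslate2's docs, not from the original card):

```bash
# Installs CTranslate2 along with the ct2-transformers-converter entry point
pip install ctranslate2
```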
For more information about the converter, see the CTranslate2 documentation or the "CTranslate2 Installation" section below.
This conversion transforms the model from OpenAI's vanilla Whisper family to the faster-whisper family, making it compatible with WhisperX, which utilizes faster-whisper models for improved performance.
- For reference: "WhisperX enables significantly faster transcription speeds - up to 70x realtime with large-v2 models while requiring less than 8GB GPU memory."
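Since the converted model is in faster-whisper format, it can also be loaded directly with the `faster_whisper` library. A minimal sketch, assuming `pip install faster-whisper` and a local recitation file named `recitation.mp3` (both are assumptions, not part of the original card):

```python
from faster_whisper import WhisperModel

# Download the converted model from the Hugging Face Hub and run it on GPU;
# use device="cpu" and compute_type="int8" if no GPU is available.
model = WhisperModel("OdyAsh/faster-whisper-base-ar-quran", device="cuda", compute_type="float16")

# transcribe() returns a generator of segments plus metadata about the audio
segments, info = model.transcribe("recitation.mp3", language="ar")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```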
## Usage Example
Follow the Python usage 🐍 section in WhisperX's README page here, but change this line:
model = whisperx.load_model("large-v2", device, compute_type=compute_type)
to this line:
model = whisperx.load_model("OdyAsh/faster-whisper-base-ar-quran", device, compute_type=compute_type)
For another usage example, see specific parts of the surah-splitter repo here and here.
## Quantization Side Note
You'll see that in the command above, we're using `--quantization float16`. However, the original tarteel-ai/whisper-base-ar-quran model is in float32 precision (source). The float16 conversion was kept anyway for the following reasons:
- Reduced size: 141 MB instead of 290 MB.
- Negligible performance impact for the intended use case: when this model was tested on sample Quran audio files, there were a few transcription errors, but they didn't affect the overall performance of OdyAsh's WhisperX-based solution that used this model, since the subsequent steps in that solution's pipeline (e.g., alignment, the reference <-> input matching DP algorithm, etc.) still yielded accurate results.
However, if the transcription results are not satisfactory for your use case, you can always fall back to float32 precision in one of two ways:
- Changing the `--quantization` argument to `float32` in the command above, which yields a larger model size (around 290 MB).
- Or, during inference runtime, setting the `compute_type` argument of `whisperx.load_model()` to `float32`.
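For example, the first option reuses the conversion command from above with only the quantization flag changed:

```bash
ct2-transformers-converter --model tarteel-ai/whisper-base-ar-quran --force --output_dir "path/to/output/dir/faster-whisper-base-ar-quran" --quantization float32 --copy_files added_tokens.json normalizer.json preprocessor_config.json special_tokens_map.json tokenizer_config.json
```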
## CTranslate2 Installation
In the OdyAsh/faster-whisper-base-ar-quran GitHub repo, you'll see `pyproject.toml` and `uv.lock` files, indicating that you can use `uv` instead of pip to install the packages required by the `ct2-transformers-converter` command (if you want).
Steps:
1. Install `uv`, if not already installed, by following this section in their docs.
2. In your terminal, navigate to the local directory into which you cloned the OdyAsh/faster-whisper-base-ar-quran GitHub repo.
3. Install the required packages right away (since the `uv.lock` file is already present in that directory):

   ```bash
   uv sync
   ```

4. Verify the installation:

   ```bash
   ct2-transformers-converter --help
   ```
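Since `uv sync` installs into the project's virtual environment, you may need to either activate that environment first or prefix commands with `uv run` (a standard `uv` feature, noted here as a usage tip rather than something from the original card):

```bash
# Runs the converter inside the project's virtual environment
uv run ct2-transformers-converter --help
```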