Whisper Small finetuned for Turkmen

This is a finetuned version of openai/whisper-small for Automatic Speech Recognition (ASR) in the Turkmen language (tk). The model was trained on the Turkmen subset of the Mozilla Common Voice 17.0 dataset.

This model card explains how to use the model, what it is intended for, and how it was trained.

Model Description

  • Base Model: openai/whisper-small
  • Language: Turkmen (tk)
  • Dataset: Mozilla Common Voice 17.0
  • Fine-tuned by: Atamyrat2005

The model transcribes Turkmen speech to text. Finetuning whisper-small adapts it to the phonetics, vocabulary, and grammar of Turkmen, with the aim of improving transcription accuracy over the base multilingual model on this language.

Intended Use & Limitations

This model is intended for transcribing Turkmen speech. It can be used in various applications, such as:

  • Transcribing audio or video files.
  • Building voice-controlled applications for Turkmen speakers.
  • Assisting in linguistic research for the Turkmen language.

Limitations:

  • The model's performance is highly dependent on the quality of the input audio. It may perform poorly on very noisy audio or audio with strong background music.
  • The training data comes from the Common Voice dataset, which primarily consists of read speech. The model may not generalize perfectly to spontaneous, conversational, or highly dialectal speech.
  • Because it is based on whisper-small, it may be less accurate than larger Whisper models (e.g., whisper-medium or whisper-large), but it offers a good balance between accuracy and computational cost.

How to Use

The simplest way to run inference with this model is the pipeline API of the transformers library.

1. Installation

First, make sure you have the necessary libraries installed.

pip install --upgrade transformers torch datasets
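
With the libraries installed, inference could look roughly like the sketch below. It is an illustration, not the author's reference code. The repository id is the one shown on the model page (Atamyrat2005/whisper-base-tk-finetuned). The file name sample.wav is a placeholder, and the helper names generation_options and transcribe are made up for this example. It also assumes that ffmpeg is available for decoding audio files and that the checkpoint keeps Whisper's language tokens, so that "turkmen" can be passed as the decoding language.

```python
"""Minimal sketch: transcribe a Turkmen audio file with the transformers ASR pipeline."""

# Repository id as shown on the model page.
MODEL_ID = "Atamyrat2005/whisper-base-tk-finetuned"


def generation_options(language="turkmen", task="transcribe"):
    """Keyword arguments forwarded to Whisper's generate() call."""
    return {"language": language, "task": task}


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcription of one audio file as a string."""
    # Imported here so the helper above can be used without loading torch.
    import torch
    from transformers import pipeline

    device = 0 if torch.cuda.is_available() else -1  # first GPU if present, else CPU
    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device=device,
        chunk_length_s=30,  # process long recordings in 30-second chunks
    )
    result = asr(audio_path, generate_kwargs=generation_options())
    return result["text"]


if __name__ == "__main__":
    # "sample.wav" is a placeholder path, not a file shipped with the model.
    print(transcribe("sample.wav"))
```

Setting the language explicitly avoids relying on Whisper's automatic language detection, which may confuse Turkmen with related Turkic languages. If the checkpoint rejects the language argument, calling the pipeline without generate_kwargs is a reasonable fallback.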

Model size: 72.6M params (Safetensors, tensor type F32)