Whisper Small finetuned for Turkmen

This is a finetuned version of openai/whisper-small for Automatic Speech Recognition (ASR) in the Turkmen language (tk). The model was trained on the Turkmen subset of the Mozilla Common Voice 17.0 dataset.

This model card explains how to use the model, what it is intended for, and how it was trained.

Model Description

Base Model: openai/whisper-small
Language: Turkmen (tk)
Dataset: Mozilla Common Voice 17.0
Fine-tuned by: Atamyrat2005

The model transcribes Turkmen speech to text. Fine-tuning whisper-small adapts it to the phonetics, vocabulary, and grammar of Turkmen, with the aim of improving transcription accuracy over the base multilingual model on this task.

Intended Use & Limitations

This model is intended for transcribing Turkmen speech. It can be used in various applications, such as:

Transcribing audio or video files.
Building voice-controlled applications for Turkmen speakers.
Assisting in linguistic research for the Turkmen language.

Limitations:

The model's performance is highly dependent on the quality of the input audio. It may perform poorly on very noisy audio or audio with strong background music.
The training data comes from the Common Voice dataset, which consists primarily of read speech. The model may not generalize well to spontaneous, conversational, or highly dialectal speech.
Because it is based on whisper-small, it may be less accurate than larger Whisper models (e.g., whisper-medium or whisper-large), but it offers a good balance between accuracy and computational cost.

How to Use

You can use this model with the transformers library pipeline for a straightforward inference experience.

1. Installation

First, make sure you have the necessary libraries installed.

pip install --upgrade transformers torch datasets
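
With those libraries installed, transcription through the transformers pipeline might look like the sketch below. It is an illustration under stated assumptions, not the author's reference code. The repository id is taken from the hosting page rather than from the card text. The helper names (pick_device, build_generate_kwargs, transcribe) and the file name sample.wav are hypothetical. Passing language="turkmen" assumes the checkpoint's generation config accepts the language argument. If it does not, call transcribe with language=None. Reading audio from a file path also relies on ffmpeg being available on the system.

```python
from typing import Optional

# Assumption: repository id as shown on the hosting page, not stated in the card text.
MODEL_ID = "Atamyrat2005/whisper-base-tk-finetuned"


def pick_device() -> str:
    """Return "cuda:0" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


def build_generate_kwargs(language: Optional[str] = "turkmen") -> dict:
    """Build the generation arguments forwarded to Whisper's generate().

    With language=None the dict is empty and Whisper falls back to its
    own language detection.
    """
    if language is None:
        return {}
    return {"language": language, "task": "transcribe"}


def transcribe(
    audio_path: str,
    model_id: str = MODEL_ID,
    language: Optional[str] = "turkmen",
) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device=pick_device(),
        chunk_length_s=30,  # split recordings longer than 30 s into chunks
    )
    result = asr(audio_path, generate_kwargs=build_generate_kwargs(language))
    return result["text"]


# Example call (hypothetical file name):
# print(transcribe("sample.wav"))
```

The pipeline is built inside transcribe to keep the sketch short. For many files, build it once and reuse it so the model is not reloaded on every call.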
