metadata

base_model: openai/whisper-small
language:
  - ar
license: apache-2.0
metrics:
  - wer
tags:
  - generated_from_trainer
model-index:
  - name: Tunisian Checkpoint7
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: custom_tunisian_dataset
          type: dataset
          args: 'config: ar, split: test'
        metrics:
          - type: wer
            value: 53.46233807772269
            name: Wer
          - type: cer
            value: 25.129550050556116
            name: Cer

Model Card for Model ID

Fine-tuning Whisper on a custom Tunisian dataset

Model Details

Model Description

This model is a fine-tuned version of openai/whisper-small on the tunisian_custom dataset (about 4 hours of audio: /doumawT02 + dataset1 + dataset2). It achieves the following results on the evaluation set:

Train Loss: 0.1355
Evaluation Loss: 1.1025073528289795
WER: 53.46233807772269
CER: 25.129550050556116

Developed by: Ameni Khabthani
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Model type: Automatic speech recognition (ASR)
Language(s) (NLP): Arabic (ar), Tunisian dialect
License: apache-2.0
Finetuned from model [optional]: openai/whisper-small

Model Sources [optional]

Repository: [More Information Needed]
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]

Training Hyperparameters

per_device_train_batch_size=8
gradient_accumulation_steps=8
learning_rate=1e-5
optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-8
warmup_steps=100
max_steps=4000
gradient_checkpointing=True
fp16=True
save_steps=500
eval_steps=500
per_device_eval_batch_size=8
predict_with_generate=True
generation_max_length=251
lr_scheduler_type=linear
lr_scheduler_warmup_steps=500
training_steps=4000
mixed_precision_training=Native AMP
logging_steps=50
weight_decay=0.01
dropout=0.1
seed=42
save_total_limit=5
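
To make the batch size and learning-rate settings above concrete, here is a minimal Python sketch. It assumes a single training device (the card does not state the hardware) and the usual linear warmup followed by linear decay to zero at max_steps. The card lists both 100 and 500 as warmup values, so the sketch takes the warmup length as a parameter instead of choosing one. The function names are illustrative and not part of the training code.

```python
# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 8
LEARNING_RATE = 1e-5
MAX_STEPS = 4000


def effective_batch_size(num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def linear_lr(step: int, warmup_steps: int) -> float:
    """Linear warmup from 0 to LEARNING_RATE, then linear decay to 0 at MAX_STEPS."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, MAX_STEPS - step)
    return LEARNING_RATE * remaining / max(1, MAX_STEPS - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for warmup in (100, 500):  # both warmup values appear in the list above
        samples = [linear_lr(s, warmup) for s in (0, warmup, MAX_STEPS)]
        print("warmup", warmup, "->", samples)
```

Under the single-device assumption, each optimizer step sees 8 x 8 = 64 examples, so 4000 steps correspond to 256,000 training examples in total.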

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

Training regime: fp16 mixed precision (Native AMP)

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

The test split of custom_tunisian_dataset (config: ar), as listed in the metadata.

Factors

[More Information Needed]

Metrics

Word error rate (WER) and character error rate (CER).
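
For readers unfamiliar with the WER and CER figures reported in this card, the sketch below shows the common definition of both: total edit distance between reference and hypothesis, divided by the total reference length, counted in words or in characters. It is a plain-Python illustration, not the evaluation code used for this model. Reporting on a 0-100 scale, counting spaces as characters in CER, and applying no text normalization are all assumptions here.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: List[str], hyps: List[str], unit: str = "word") -> float:
    """Corpus-level error rate in percent: total edits / total reference units."""
    split = (lambda s: s.split()) if unit == "word" else list
    edits = sum(edit_distance(split(r), split(h)) for r, h in zip(refs, hyps))
    total = sum(len(split(r)) for r in refs)
    return 100.0 * edits / total


if __name__ == "__main__":
    refs = ["the cat sat"]
    hyps = ["the cat sit"]
    print("WER:", error_rate(refs, hyps, unit="word"))
    print("CER:", error_rate(refs, hyps, unit="char"))
```

In the toy example, one of three words is wrong, so the WER is one third, while only one of eleven characters differs, which is why CER is typically much lower than WER for the same transcripts.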

Results

WER: 53.46, CER: 25.13, evaluation loss: 1.1025.

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]