---
model-index:
- name: artificialguybr/whisper-small-pt-cv13
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: mozilla-foundation/common_voice_13_0
      type: mozilla-foundation/common_voice_13_0
      config: pt
      split: test
    metrics:
    - type: wer
      value: 10.30
      name: WER
---

# Model Card for whisper-small-pt-cv13

This model is a fine-tune of Whisper Small that aims to improve transcription quality for Portuguese.
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.


## Model Details

Whisper is a Transformer-based encoder-decoder model, also referred to as a _sequence-to-sequence_ model.
It was trained on 680k hours of labelled speech data annotated using large-scale weak supervision. 

This fine-tune uses the Portuguese subset of Common Voice 13.0 to improve results for Portuguese.

- **Developed by:** [ArtificialGuyBr](https://twitter.com/@artificialguybr)
- **Shared by:** [ArtificialGuyBr](https://twitter.com/@artificialguybr)

## Uses

This repository contains a fine-tuned version of the Whisper ASR (Automatic Speech Recognition) system developed by OpenAI. The model has been fine-tuned specifically to improve performance on Portuguese speech.
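A minimal sketch of how the model might be used for transcription, assuming the `transformers` library (with a PyTorch backend and ffmpeg for audio decoding) is installed. The model ID comes from this card's metadata; the helper names `build_transcriber` and `transcribe`, the `chunk_length_s=30` setting, and the file name `audio.wav` are illustrative assumptions, not part of the original training or evaluation setup.

```python
MODEL_ID = "artificialguybr/whisper-small-pt-cv13"  # taken from this card's metadata


def build_transcriber(model_id: str = MODEL_ID):
    """Create a Hugging Face ASR pipeline for the fine-tuned checkpoint."""
    # Imported lazily so the helpers can be defined without loading the library.
    from transformers import pipeline

    # chunk_length_s=30 is an assumed setting for long-form audio.
    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,
    )


def transcribe(audio_path: str, transcriber=None) -> str:
    """Return the transcription text for one audio file."""
    if transcriber is None:
        transcriber = build_transcriber()
    result = transcriber(audio_path)
    return result["text"].strip()


if __name__ == "__main__":
    # "audio.wav" is a hypothetical path to a Portuguese recording.
    print(transcribe("audio.wav"))
```

Building the pipeline once and passing it to `transcribe` avoids reloading the weights for every file.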

### Out-of-Scope Use

While this model is powerful and versatile, it's important to understand its limitations and inappropriate uses:

1. **Misuse and Malicious Use**: This model should not be used for any illegal activities, including but not limited to eavesdropping, illegal surveillance, or any other form of privacy invasion. It's also not intended for the creation or spread of misinformation, hate speech, or harmful content.

2. **Non-Portuguese Languages**: Because this model has been fine-tuned for Portuguese, it may not perform well on other languages. It is not recommended for transcribing multilingual content in which languages other than Portuguese are spoken.

3. **Low-Quality Audio**: The model's performance can be significantly affected by the quality of the input audio. It may not work well with low-quality audio, background noise, or speakers who are far away from the microphone.

## Training Details

### Training Procedure 

The model was trained with the code from the Hugging Face Whisper Event.

#### Training Hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 5000
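
To make the schedule above concrete, the sketch below computes the learning rate at a given step. It assumes that `linear` means linear warmup from zero followed by linear decay to zero, as in the usual Hugging Face linear schedule; the function name `linear_schedule_lr` is illustrative and not taken from the training code.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float = 1e-5,
    warmup_steps: int = 500,
    total_steps: int = 5000,
) -> float:
    """Learning rate at `step` under linear warmup and linear decay (assumed)."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 250, 500, 2750, 5000):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at 1e-05 at step 500 and returns to zero at step 5000.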


---
### 🌐 Website
You can find more of my models, projects, and information on my official website:
- **[artificialguy.com](https://artificialguy.com/)**

### 💖 Support My Work
If you find this model useful, please consider supporting my work. It helps me cover server costs and dedicate more time to new open-source projects.
- **Patreon:** [Support on Patreon](https://www.patreon.com/user?u=81570187)
- **Ko-fi:** [Buy me a Ko-fi](https://ko-fi.com/artificialguybr)
- **Buy Me a Coffee:** [Buy me a Coffee](https://buymeacoffee.com/jvkape)

## Evaluation

WER (word error rate) on the Common Voice 13.0 Portuguese test split: 10.30
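
For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below shows that definition for a single sentence pair. It is illustrative only: the function name is hypothetical, and the reported score was presumably computed over the whole test set with standard tooling and possibly text normalization, which this sketch does not reproduce.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("está" vs. "esta") in a five-word reference.
    print(word_error_rate("o gato está no telhado", "o gato esta no telhado"))
```

A score over a whole test set is usually obtained by summing edits and reference words across all utterances rather than averaging per-sentence values.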


- **Hardware Type:** 1x A100 80GB
- **Hours used:** 8 hours
- **Cloud Provider:** [Redmond.ai](https://redmond.ai)