whisper-swedish-media

whisper-swedish-media is a fine-tuned version of OpenAI's whisper-small, optimized for transcribing spoken Swedish from YouTube, TV programs, podcasts, and general media.

🔍 Model Overview

  • Base: openai/whisper-small
  • Language: Swedish (sv)
  • Domain: Broadcast media (TV, YouTube, podcasts)
  • Training Data: ~70 hours of manually transcribed audio using long-form conventions
  • Special Tags: <overlap>, <lang:English>, #eh, ((word)), etc.
  • Sampling Rate: 16 kHz
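
Because the training transcripts carry annotation markup, model output may contain it too. The card lists the tags but does not define them, so the following is only a minimal sketch for stripping them before downstream use. It assumes that angle-bracket tags are markers to drop, that `#`-prefixed tokens such as `#eh` are fillers, and that `((word))` wraps a word whose text should be kept. The function name `strip_annotations` and the regular expressions are illustrative, not part of the model.

```python
import re

# Assumed tag shapes, inferred only from the examples listed in the card.
ANGLE_TAG = re.compile(r"</?[A-Za-z]+(?::[^<>]*)?>")  # <overlap>, <lang:English>
FILLER = re.compile(r"(?<!\S)#\w+")                    # #eh
DOUBLE_PAREN = re.compile(r"\(\(([^()]*)\)\)")         # ((word))


def strip_annotations(text: str) -> str:
    """Return plain text with the assumed annotation markup removed."""
    text = ANGLE_TAG.sub(" ", text)
    text = FILLER.sub(" ", text)
    text = DOUBLE_PAREN.sub(r"\1", text)  # keep the enclosed word (assumption)
    return " ".join(text.split())         # collapse leftover whitespace
```

For example, `strip_annotations("det var <overlap> #eh ((bra))")` yields `"det var bra"`. If the tags matter for your use case (for instance, detecting overlapping speech), keep them and parse them instead.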

📊 Evaluation

On a holdout set of 300 media segments:

Model Checkpoint          WER
Base (8 kHz model)        1.234
Epoch 1 (Media)           0.376
Epoch 2 (Media)           0.543
Epoch 3 (Media) ✅ Best   0.205
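
The card does not say how WER was computed or whether the text was normalized first. As a reference point, the standard definition is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below implements that definition on whitespace-split tokens. The function name `word_error_rate` is illustrative. Because insertions count as errors, the ratio can exceed 1, which is how a value such as 1.234 can arise.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Scores depend heavily on casing, punctuation, and how annotation tags are treated, so apply the same normalization to reference and hypothesis before comparing checkpoints.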

🔐 Access & Licensing

This model is gated. To request access, including for commercial use, click “Access repository” above and include a short message about your intended use. Requests are reviewed manually within 24–48 hours.


📌 Intended Use

This model is built for:

  • Media transcription (TV, YouTube, news)
  • Swedish NLP benchmarking
  • Broadcast monitoring and compliance tools

🛒 Commercial Access

This model is available via AWS Marketplace. Subscribe to unlock GPU-accelerated transcription.


🚀 Usage Example

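import soundfile as sf
from transformers import WhisperProcessor, WhisperForConditionalGeneration

# The repository is gated: accept the conditions and authenticate first
# (for example with `huggingface-cli login`).
processor = WhisperProcessor.from_pretrained("WMRNORDIC/whisper-swedish-media")
model = WhisperForConditionalGeneration.from_pretrained("WMRNORDIC/whisper-swedish-media")

# The processor expects 16 kHz mono audio; resample the file beforehand if needed.
audio, rate = sf.read("your_file.wav", dtype="float32")
if audio.ndim > 1:
    audio = audio.mean(axis=1)  # downmix multi-channel audio to mono

# Convert the waveform to log-mel features, generate token IDs, and decode them.
inputs = processor(audio, sampling_rate=rate, return_tensors="pt")
generated_ids = model.generate(inputs.input_features)
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print("Transcription:", transcription)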
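
The example above handles a single input window. Whisper's feature extractor pads or truncates each input to 30 seconds, so longer recordings need to be split first. The sketch below computes overlapping window boundaries in samples, using the 16 kHz rate stated in the overview. The function name `chunk_bounds` and the 1-second default overlap are assumptions for illustration, not settings documented for this model.

```python
SAMPLE_RATE = 16_000  # from the model overview: 16 kHz
WINDOW_S = 30.0       # Whisper's fixed input window, in seconds


def chunk_bounds(num_samples, window_s=WINDOW_S, overlap_s=1.0,
                 sample_rate=SAMPLE_RATE):
    """Return (start, end) sample indices that cover the audio with
    overlapping windows of at most `window_s` seconds."""
    window = int(window_s * sample_rate)
    step = window - int(overlap_s * sample_rate)
    if step <= 0:
        raise ValueError("overlap_s must be shorter than window_s")

    bounds = []
    start = 0
    while start < num_samples:
        end = min(start + window, num_samples)
        bounds.append((start, end))
        if end == num_samples:
            break  # the last window reached the end of the audio
        start += step
    return bounds
```

Each slice `audio[start:end]` can then be passed through the processor and `model.generate` as in the example. Merging the text from overlapping windows is left out here. The `transformers` automatic-speech-recognition pipeline offers a `chunk_length_s` option that may be a simpler route.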

Model size: 242M params (Safetensors, tensor type F32)