whisper-swedish-media
`whisper-swedish-media` is a fine-tuned version of OpenAI's `whisper-small`, optimized for transcribing spoken Swedish from YouTube, TV programs, podcasts, and general media.
Model Overview
- Base: `openai/whisper-small`
- Language: Swedish (`sv`)
- Domain: Broadcast media (TV, YouTube, podcasts)
- Training Data: ~70 hours of manually transcribed audio using long-form conventions
- Special Tags: `<overlap>`, `<lang:English>`, `#eh`, `((word))`, etc.
- Sampling Rate: 16 kHz
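Because transcripts may contain the special tags listed above, downstream NLP code will often want plain text. The following is a minimal sketch of one way to strip them. The tag shapes are inferred only from the four examples in the list, the meaning of each tag is not documented here, and `strip_tags` is a hypothetical helper, not part of the model or of `transformers`.

```python
import re

# Assumed tag shapes, inferred from the examples in the overview list only.
ANGLE_TAG = re.compile(r"<[^<>]+>")          # e.g. <overlap>, <lang:English>
HASH_TAG = re.compile(r"(?<!\S)#\w+")        # e.g. #eh
DOUBLE_PAREN = re.compile(r"\(\((.*?)\)\)")  # e.g. ((word))


def strip_tags(text: str) -> str:
    """Remove annotation tags and collapse the leftover whitespace."""
    text = ANGLE_TAG.sub(" ", text)
    text = HASH_TAG.sub(" ", text)
    # Assumption: the words inside ((...)) are worth keeping, so only the
    # parentheses are dropped.
    text = DOUBLE_PAREN.sub(r"\1", text)
    return " ".join(text.split())


print(strip_tags("<overlap> ja #eh det är ((kanske)) <lang:English> okay"))
```

Since the list ends with "etc.", other tags may exist that these patterns do not cover; check real output before relying on this.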
Evaluation
On a holdout set of 300 media segments:

| Model Checkpoint | WER |
|---|---|
| Base (8 kHz model) | 1.234 |
| Epoch 1 (Media) | 0.376 |
| Epoch 2 (Media) | 0.543 |
| Epoch 3 (Media), Best | 0.205 |
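For readers who want to reproduce this kind of score on their own data, here is a minimal sketch of word error rate, assuming the table reports the standard definition (word-level edit distance divided by the number of reference words) as a fraction. The text normalization and tag handling used for the figures above are not stated, so `word_error_rate` is illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("jag bor i Stockholm", "jag bor Stockholm"))
print(word_error_rate("hej", "hej hej du"))
```

Under this definition insertions count against the hypothesis, so a score above 1.0 is possible when the output contains many more words than the reference.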
Access & Licensing
This model is gated. For commercial use or full access:
- Contact: [email protected]
- AWS Marketplace (WMRNORDIC)
To request access, click "Access repository" above and include a short message about your intended use. Requests are reviewed manually within 24 to 48 hours.
Intended Use
This model is built for:
- Media transcription (TV, YouTube, news)
- Swedish NLP benchmarking
- Broadcast monitoring and compliance tools
Commercial Access
This model is available via AWS Marketplace. Subscribe to unlock GPU-accelerated transcription.
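Usage Example
```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import soundfile as sf

# The repository is gated: authenticate first (e.g. `huggingface-cli login`).
processor = WhisperProcessor.from_pretrained("WMRNORDIC/whisper-swedish-media")
model = WhisperForConditionalGeneration.from_pretrained("WMRNORDIC/whisper-swedish-media")

# Load the audio; the model expects 16 kHz mono input.
audio, rate = sf.read("your_file.wav", dtype="float32")
if audio.ndim > 1:
    audio = audio.mean(axis=1)  # downmix multi-channel audio to mono
if rate != 16000:
    raise ValueError("Resample the file to 16 kHz before transcribing.")

inputs = processor(audio, sampling_rate=rate, return_tensors="pt")
generated_ids = model.generate(inputs.input_features)
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print("Transcription:", transcription)
```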
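Whisper's feature extractor works on 30-second windows, so a single `generate` call as in the usage example only covers the start of a longer recording. The sketch below shows one naive way to handle long media files: cut the audio into consecutive 30-second windows and transcribe each one. `chunk_bounds` and `transcribe_long` are hypothetical helpers written for this illustration, and the hard cuts can split a word at a window boundary.

```python
def chunk_bounds(n_samples: int, rate: int = 16000, chunk_s: float = 30.0):
    """Return (start, end) sample indices for consecutive windows."""
    if n_samples < 0 or rate <= 0 or chunk_s <= 0:
        raise ValueError("n_samples must be >= 0; rate and chunk_s must be > 0")
    step = int(rate * chunk_s)
    return [(start, min(start + step, n_samples))
            for start in range(0, n_samples, step)]


def transcribe_long(audio, rate, processor, model):
    """Transcribe each window separately and join the pieces with spaces.

    `processor` and `model` are the objects loaded in the usage example.
    """
    pieces = []
    for start, end in chunk_bounds(len(audio), rate):
        inputs = processor(audio[start:end], sampling_rate=rate, return_tensors="pt")
        ids = model.generate(inputs.input_features)
        text = processor.batch_decode(ids, skip_special_tokens=True)[0].strip()
        pieces.append(text)
    return " ".join(p for p in pieces if p)


# 65 seconds of 16 kHz audio yields two full windows and one 5-second remainder.
print(chunk_bounds(16000 * 65))
```

For production use, the `transformers` automatic-speech-recognition pipeline with its `chunk_length_s` argument may be a better fit, since it overlaps neighbouring windows instead of cutting hard.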