Please add State of the Art proprietary models to OpenASR leaderboard
Add Gemini 2.5 Pro
According to the release paper, Gemini 2.5 Pro could be SOTA.
(https://huggingface.co/google)
Add Soniox
Please add Soniox https://soniox.com/
Documentation for Async Transcription
https://soniox.com/docs/speech-to-text/get-started/transcribe-file
Documentation for realtime transcription
https://soniox.com/docs/speech-to-text/get-started/transcribe-realtime
https://huggingface.co/soniox
Add Amazon Nova Sonic
Please add Amazon Nova Sonic https://aws.amazon.com/de/ai/generative-ai/nova/speech/
API Documentation
https://docs.aws.amazon.com/nova/latest/userguide/speech.html
Add OpenAI gpt-4o-transcribe
Please add OpenAI gpt-4o-transcribe https://openai.com/index/introducing-our-next-generation-audio-models/
Here is the API Documentation
https://platform.openai.com/docs/api-reference/audio/createTranscription
Add Wizper V3 from fal.ai
Please add https://fal.ai/models/fal-ai/wizper to the evaluation
API Documentation:
https://fal.ai/models/fal-ai/wizper/api
Add cartesia-ink
Please add https://cartesia.ai/ink to the evaluation
API Documentation:
https://docs.cartesia.ai/2025-04-16/build-with-cartesia/models/stt#ink-whisper
@Steveeeeeeen: What do you think? Do you have plans to add more models?
Many models are still missing. Most benchmarks are not independent, so OpenASR is a relief here. I found one more benchmark at https://voicewriter.io/speech-recognition-leaderboard , which also tests the Gemini models and ranks them in the top 3. That benchmark, however, is not as comprehensive as OpenASR.