title: IPA Transcription Leaderboard
emoji: ๐
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.12.0
app_file: app/app.py
pinned: true
license: agpl-3.0
thumbnail: >-
https://cdn-uploads.huggingface.co/production/uploads/61dd07bafdc070745eed96fd/QC0vfJ-i0oc77NAM8Fdjs.png
short_description: Speech-to-phoneme leaderboard
English Phonemic Transcription Leaderboard
Welcome to the English Phonemic Transcription Leaderboard! This simple leaderboard tracks and compares the performance of different speech-to-phoneme models. Feel free to fork it for your own Hugging Face leaderboards!
Features
- Interactive leaderboard with real-time sorting
- Easy model submission system
- Automatic evaluation of submitted models
- Responsive design that works on all devices
What This Project Does
This leaderboard tracks two key metrics for phonemic transcription models:
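- PER (Phoneme Error Rate): The share of phonemes your model gets wrong (substitutions, insertions, and deletions) relative to the length of the reference transcription
- PWED (Phoneme Weighted Edit Distance): A more nuanced metric that takes phonemic features into account, so confusing two similar phonemes costs less than confusing two dissimilar ones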
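To make the two metrics concrete, here is a minimal sketch of how they can be computed. It is not the leaderboard's evaluation code: `edit_distance`, `phoneme_error_rate`, `feature_sub_cost`, and the three-feature `TOY_FEATURES` table are all hypothetical names made up for illustration, and the real PWED relies on the panphon library's much richer feature set.

```python
from typing import Callable, Sequence


def unit_sub_cost(a: str, b: str) -> float:
    """Plain Levenshtein substitution cost: 0 for a match, 1 otherwise."""
    return 0.0 if a == b else 1.0


def edit_distance(
    ref: Sequence[str],
    hyp: Sequence[str],
    sub_cost: Callable[[str, str], float] = unit_sub_cost,
) -> float:
    """Edit distance over phoneme sequences with a pluggable substitution cost."""
    # prev[j] is the cost of aligning the reference prefix seen so far with hyp[:j].
    prev = [float(j) for j in range(len(hyp) + 1)]
    for i, r in enumerate(ref, start=1):
        curr = [float(i)]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1.0,                 # delete a reference phoneme
                curr[j - 1] + 1.0,             # insert a hypothesis phoneme
                prev[j - 1] + sub_cost(r, h),  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER = unweighted edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical toy features (voiced, nasal, labial); an assumption for this
# sketch only, not the feature set used by panphon or by the leaderboard.
TOY_FEATURES = {
    "p": (0, 0, 1), "b": (1, 0, 1), "m": (1, 1, 1),
    "t": (0, 0, 0), "d": (1, 0, 0), "n": (1, 1, 0),
}


def feature_sub_cost(a: str, b: str) -> float:
    """Substitution cost = fraction of toy features on which two phonemes differ."""
    fa, fb = TOY_FEATURES[a], TOY_FEATURES[b]
    return sum(x != y for x, y in zip(fa, fb)) / len(fa)
```

Under these assumptions, `phoneme_error_rate(["k", "æ", "t"], ["k", "æ", "d"])` counts one substitution out of three reference phonemes, while `edit_distance(["p", "t"], ["b", "t"], feature_sub_cost)` charges only a fraction of a full error because the toy features for "p" and "b" differ in voicing alone.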
Read more about evaluations on our blog
Models are evaluated on the TIMIT speech corpus, a gold standard in speech recognition research.
Getting Started
Navigate to the hosted version on Hugging Face or follow the instructions in DEVELOPMENT.md to run the leaderboard locally.
Using the Leaderboard
Submitting a Model
- Go to the "Submit Model" tab
- Enter your model details:
  - Model name (e.g., "wav2vec2-phoneme-wizard")
  - Submission name (e.g., "MyAwesomeModel v1.0")
  - GitHub/Kaggle/HuggingFace URL (optional)
- Click Submit and watch your model climb the ranks!
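If you script your submissions, it can help to check the form fields before sending them. The sketch below is one way to model the three fields listed above; the `Submission` class and its validation rules are assumptions for illustration and do not reflect how the leaderboard itself validates input.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class Submission:
    """Hypothetical record mirroring the fields of the "Submit Model" tab."""

    model_name: str
    submission_name: str
    url: Optional[str] = None  # GitHub/Kaggle/HuggingFace link, optional

    def __post_init__(self) -> None:
        if not self.model_name.strip():
            raise ValueError("model name is required")
        if not self.submission_name.strip():
            raise ValueError("submission name is required")
        if self.url is not None:
            parsed = urlparse(self.url)
            # Assumed rule: accept only absolute http(s) URLs with a host.
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                raise ValueError("url must be an absolute http(s) URL")
```

For example, `Submission("wav2vec2-phoneme-wizard", "MyAwesomeModel v1.0")` is accepted with no URL, while an empty model name raises `ValueError`.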
Checking Model Status
- Navigate to the "Model Status" tab
- Enter your model name or task ID
- Get real-time updates on your model's evaluation progress
Understanding the Results
The leaderboard shows:
- Model names and submission details
- PER and PWED scores (lower is better!)
- Links to model repositories
- Submission dates
Sort by either metric to see who's leading the pack!
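Because both scores are error measures, ranking means sorting in ascending order. The following sketch shows the idea on placeholder rows; the model names, scores, and the `rank` helper are invented for illustration and are not real leaderboard results or code.

```python
from operator import itemgetter

# Placeholder rows: names and scores are made up for this example only.
rows = [
    {"model": "model-a", "per": 0.30, "pwed": 0.12},
    {"model": "model-b", "per": 0.25, "pwed": 0.15},
    {"model": "model-c", "per": 0.28, "pwed": 0.10},
]


def rank(entries, metric):
    """Return entries sorted ascending by the metric, since lower is better."""
    if metric not in ("per", "pwed"):
        raise ValueError(f"unknown metric: {metric}")
    return sorted(entries, key=itemgetter(metric))
```

Note that the two metrics can disagree: in these placeholder rows, the entry with the lowest PER is not the one with the lowest PWED, which is why the table lets you sort by either column.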
Technical Details
- Built with Gradio for a smooth UI experience
- Runs on a basic compute plan (16 GB RAM, 2 vCPUs) for easy reproducibility
- Evaluation can take several hours, so it's the perfect time to grab a coffee
Contributing
Want to make this leaderboard even better? We'd love your help! Here are some ways you can contribute:
- Add new evaluation metrics
- Improve the UI design
- Enhance documentation
- Submit bug fixes
- Add new features
Check out CONTRIBUTING.md for more details.
License
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
We retain all rights to the Koel Labs brand, logos, blog posts and website content.
Acknowledgments
- Thanks to the TIMIT speech corpus for providing evaluation data
- Shoutout to the panphon library for PWED calculations
- Built with love by Koel Labs
Need Help?
Got questions? Found a bug? Want to contribute? Open an issue or reach out to us! We're here to help make speech recognition evaluation fun and accessible for everyone!
Remember: Every great model deserves its moment to shine!
Happy Transcribing!