SanderGi committed
Commit c2e60bb · 1 Parent(s): 007a01f

fix and make functional, add more datasets
CONTRIBUTING.md CHANGED
@@ -1,7 +1,7 @@
 # Contributing to Koel Labs - IPA Transcription EN
 👍🎉 First off, thanks for taking the time to contribute! 🎉👍

-These are the specific contributing guidelines for the English IPA transcription leaderboard. Checkout our [general contributing guidelines here](https://github.com/KoelLabs/.github/blob/main/CONTRIBUTING.md).
+These are the specific contributing guidelines for the English IPA transcription leaderboard. Check out our [general contributing guidelines here](https://github.com/KoelLabs/.github/blob/main/CONTRIBUTING.md).

 ## Where to Start
DEVELOPMENT.md CHANGED
@@ -2,47 +2,69 @@

 ## Design Decisions

-We specifically opt for a single-space leaderboard for simplicity. We solve the issue of keeping the Gradio UI interactive while models are evaluating by using background tasks instead of a separate space.
+We specifically opt for a single-space leaderboard for simplicity. We solve the issue of keeping the Gradio UI interactive while models are evaluating by using multiprocessing instead of a separate space. Leaderboard entries are persisted in a Hugging Face Dataset to avoid paying for persistent storage. Tasks are deliberately ephemeral.

-## Setup
+## Local Setup

 ### Prerequisites

-* Python 3.10
-* Git
+* [Python 3.10](https://www.python.org/downloads/release/python-31017/)
+* [Git](https://git-scm.com/downloads)
 * A love for speech recognition! 🎤

 ### Quick Installation

+0. Make sure git-lfs is installed (https://git-lfs.com):
+   ```bash
+   git lfs install
+   ```
+
 1. Clone this repository:
    ```bash
-   GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/spaces/KoelLabs/IPA-Transcription-EN
-   cd IPA-Transcription-EN
+   git clone https://huggingface.co/spaces/KoelLabs/IPA-Transcription-EN
    ```

-2. Set up your environment and download data:
+2. Set up your environment:
    ```bash
-   . ./scripts/install.sh
+   # Create a virtual environment with Python 3.10
+   python3.10 -m venv venv
+
+   # Activate the virtual environment
+   . ./venv/bin/activate
+   # use `deactivate` to exit out of it
+
+   # Install the required dependencies
+   pip install -r requirements_lock.txt
+
+   # Add a HF_TOKEN with access to your backing dataset (in app/hf.py) and any models you want to be able to run
+   huggingface-cli login
    ```

-3. Launch the leaderboard in development mode (auto-reloads on code changes):
+3. Launch the leaderboard:
    ```bash
-   . ./scripts/run-dev.sh
+   . ./scripts/run-dev.sh   # development mode (auto-reloads)
+   . ./scripts/run-prod.sh  # production mode (no auto-reloads)
    ```

 4. Visit `http://localhost:7860` in your browser and see the magic! ✨

+### Adding New Datasets
+
+The datasets are pre-processed into a single dataset stored in `app/data/test` with three columns: audio (16 kHz), ipa, and dataset (original source). This is done using the `scripts/sample_test_set.py` file. To add new datasets, add them to this script. Beware that existing leaderboard entries will need to be recalculated. You can do this locally by accessing the dataset corresponding to `LEADERBOARD_ID` stored in `app/hf.py`.
+
-## Adding/Removing Dependencies
+### Adding/Removing Dependencies
 0. Activate the virtual environment with `. ./venv/bin/activate`
 1. Add the dependency to `requirements.txt` (or remove it)
-2. Make sure you have no unused dependencies with `pipx run deptry .`
+2. Make sure you have no unused dependencies with `pipx run deptry .` (if necessary, `python -m pip install pipx`)
 3. Run `pip install -r requirements.txt`
 4. Freeze the dependencies with `pip freeze > requirements_lock.txt`

-## Run without reloading
-```bash
-. ./scripts/run-prod.sh
-```
+## Forking Into Your Own Leaderboard
+
+0. Navigate to [the space](https://huggingface.co/spaces/KoelLabs/IPA-Transcription-EN), click the three dots on the right, and select `Duplicate this Space`
+1. Modify the `LEADERBOARD_ID` in `app/hf.py` to be some dataset that you own that the new space can use to store data. You don't need to create the dataset, but if you do, it should be empty.
+2. Open the settings in your new space and add a new secret `HF_TOKEN`. You can [create it here](https://huggingface.co/settings/tokens). It just needs read access to all models you want to add to the leaderboard and write access to the private backing dataset specified by `LEADERBOARD_ID`.
+3. Submit some models and enjoy!

 ## File Structure

@@ -56,19 +78,16 @@ IPA-Transcription-EN/
 ├── requirements.txt        # Python dependencies
 ├── requirements_lock.txt   # Locked dependencies
 ├── scripts                 # Helper scripts
-│   ├── install.sh          # Install dependencies and download data
+│   ├── sample_test_set.py  # Compute the combined test set
+│   ├── run-prod.sh         # Run the leaderboard in production mode
 │   └── run-dev.sh          # Run the leaderboard in development mode
 ├── venv                    # Virtual environment
 ├── app/                    # All application code lives here
-│   ├── data/               # Phoneme transcription datasets
-│   ├── queue/              # Stores leaderboard state and task status
-│   │   ├── tasks.json      # Task queue
-│   │   ├── results.json    # Detailed evaluation results
-│   │   └── leaderboard.json # Compact results for leaderboard display
+│   ├── data/               # Phoneme transcription test set
 │   ├── app.py              # Main Gradio UI
-│   ├── tasks.py            # Background tasks for model evaluation
-│   ├── data.py             # Data loading and processing
+│   ├── hf.py               # Interface with the Huggingface API
 │   ├── inference.py        # Model inference
-│   └── phone_metrics.py    # Evaluation metrics
+│   ├── tasks.py            # Background tasks for model evaluation
+│   └── metrics.py          # Evaluation metrics
 └── img/                    # Images for README and other documentation
 ```
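The combined test set described above has one row per utterance with `audio`, `ipa`, and `dataset` columns. A minimal pure-Python sketch of the merge step (the function and the toy rows are hypothetical; the real logic lives in `scripts/sample_test_set.py`):

```python
import random

def combine_datasets(sources: dict, per_source: int, seed: int = 42) -> list:
    """Sample up to `per_source` rows from each source and tag each row
    with the name of the dataset it came from."""
    rng = random.Random(seed)
    combined = []
    for name, rows in sources.items():
        sample = rng.sample(rows, min(per_source, len(rows)))
        for row in sample:
            combined.append({"audio": row["audio"], "ipa": row["ipa"], "dataset": name})
    return combined

# Hypothetical pre-loaded rows; real rows hold 16 kHz waveforms.
sources = {
    "TIMIT": [{"audio": b"...", "ipa": "ɑ"}, {"audio": b"...", "ipa": "æ"}],
    "PSST": [{"audio": b"...", "ipa": "ʌ"}],
}
test_set = combine_datasets(sources, per_source=2)
```

The `dataset` column is what lets the leaderboard report a per-source FER alongside the overall average.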
README.md CHANGED
@@ -13,6 +13,8 @@ thumbnail: >-
 short_description: Speech-to-phoneme leaderboard
 ---

+![Koel Labs logo](img/logo-white.png)
+
 # 🎯 English Phonemic Transcription Leaderboard

 Welcome to the English Phonemic Transcription Leaderboard! This simple leaderboard helps track and compare the performance of different speech-to-phoneme models. Feel free to fork it for your own Hugging Face leaderboards!

@@ -30,13 +32,12 @@ Welcome to the English Phonemic Transcription Leaderboard!

 This leaderboard tracks two key metrics for phonemic transcription models:

-
 * **PER (Phoneme Error Rate)**: How accurately your model converts speech to phonemes
-* **PWED (Phoneme Weighted Edit Distance)**: A more nuanced metric that considers phonemic features
+* **FER (Feature Error Rate)**: A more nuanced metric that considers phonemic features

 Read more about evaluations on our [blog](https://www.koellabs.com/blog/phonemic-transcription-metrics)

-Models are evaluated on the TIMIT speech corpus, a gold standard in speech recognition research.
+Models are evaluated on a variety of English speech: native, non-native, and impaired.

 ## 🚀 Getting Started

@@ -48,7 +49,7 @@ Navigate to the hosted version on [Hugging Face](https://huggingface.co/spaces/KoelLabs/IPA-Transcription-EN)

 1. Go to the "Submit Model" tab
 2. Enter your model details:
-   * Model name (e.g., "wav2vec2-phoneme-wizard")
+   * Model ID (e.g., "my-name/wav2vec2-phoneme-wizard")
    * Submission name (e.g., "MyAwesomeModel v1.0")
    * GitHub/Kaggle/HuggingFace URL (optional)
 3. Click Submit and watch your model climb the ranks! 🚀

@@ -56,7 +57,7 @@

 ### Checking Model Status

 1. Navigate to the "Model Status" tab
-2. Enter your model name or task ID
+2. Enter your model ID or task ID
 3. Get real-time updates on your model's evaluation progress

@@ -64,7 +65,7 @@

 The leaderboard shows:

 * Model names and submission details
-* PER and PWED scores (lower is better!)
+* PER and FER scores (lower is better!)
 * Links to model repositories
 * Submission dates

@@ -86,7 +87,7 @@ Want to make this leaderboard even better? We'd love your help!

 * Submit bug fixes
 * Add new features

-Checkout the [CONTRIBUTING.md](CONTRIBUTING.md) for more details.
+Check out the [CONTRIBUTING.md](CONTRIBUTING.md) for more details.

 ## 📝 License

@@ -94,12 +95,6 @@ This project is licensed under the GNU Affero General Public License.

 We retain all rights to the Koel Labs brand, logos, blog posts and website content.

-## 🌟 Acknowledgments
-
-* Thanks to the TIMIT speech corpus for providing evaluation data
-* Shoutout to the [panphon library](https://github.com/dmort27/panphon) for PWED calculations
-* Built with love by Koel Labs 💙
-
 ## 🆘 Need Help?

 Got questions? Found a bug? Want to contribute? [Open an issue](https://huggingface.co/spaces/KoelLabs/IPA-Transcription-EN/discussions) or [reach out to us](mailto:[email protected])! We're here to help make speech recognition evaluation fun and accessible for everyone!

@@ -108,4 +103,4 @@ Remember: Every great model deserves its moment to shine! 🌟

 ---

-Happy Transcribing! 🎤✨
+Happy Transcribing! 🎤✨
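PER, as the README describes it, is a Levenshtein edit distance over phoneme sequences, normalized by the reference length. A minimal sketch of that computation (illustrative only; the leaderboard's actual implementation lives in `app/metrics.py`):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two phoneme strings (one IPA symbol per char)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def per(predicted: str, actual: str) -> float:
    """Phoneme Error Rate: edit distance normalized by reference length."""
    return levenshtein(predicted, actual) / max(len(actual), 1)

print(per("kæt", "kɑt"))  # one substitution over three phonemes → 0.333…
```

FER follows the same shape but weights each substitution by the phonetic-feature distance between the two symbols (via panphon) instead of a flat cost of 1.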
app/app.py CHANGED
@@ -1,50 +1,66 @@
 # This is the main module that handles rendering the Gradio interface.
-
-# Note: gradio will automatically create REST API endpoints for the functions that are used as event handlers in the interface.
+# NOTE: gradio will automatically create REST API endpoints for the functions that are used as event handlers in the interface.

 import gradio as gr
 import pandas as pd

-from tasks import start_eval_task, get_leaderboard_data, get_status
+from tasks import start_eval_task, get_status
+from hf import get_or_create_leaderboard


-def get_latest_leaderboard_html(sort_option: str) -> str:
+def get_latest_leaderboard_html(datasets: list[str], sort_option: str) -> str:
     try:
         # Get the latest leaderboard data
-        df = get_leaderboard_data()
+        df: pd.DataFrame = get_or_create_leaderboard().sort("submission_timestamp", reverse=True).to_pandas()  # type: ignore
+        df = df.drop_duplicates("repo_id", keep="first")
+
+        if len(df) == 0:
+            return "No scores, please submit models for evaluation."

-        # Sort the dataframe so smallest PER or PWED is at the top
-        sort_column = "average_per" if sort_option.lower() == "per" else "average_pwed"
+        # Sort the dataframe so smallest PER or FER is at the top
+        sort_column = "average_per" if sort_option.lower() == "per" else "average_fer"
         df = df.sort_values(by=sort_column, ascending=True)

         # Format the dataframe for HTML display
         df = pd.DataFrame(
             {
-                "Model": df["model"],
-                "Average PER ⬇️": df["average_per"].apply(lambda x: f"{x:.4f}"),
-                "Average PWED ⬇️": df["average_pwed"].apply(lambda x: f"{x:.4f}"),
-                "Link": df["github_url"].apply(
+                "Model": df.apply(
+                    lambda r: f'<a href="https://huggingface.co/{r["repo_id"]}" target="_blank">{r["display_name"]}</a>',
+                    axis=1,
+                ),
+                "Average PER ⬇️": df["average_per"].apply(lambda x: f"{100 * x:.2f}%"),
+            }
+            | {
+                f"{d} FER ⬇️": df["average_fer" if d == "Average" else f"fer_{d}"].apply(
+                    lambda x: f"{100 * x:.2f}%"
+                )
+                for d in datasets
+            }
+            | {
+                "Link": df["url"].apply(
                     lambda x: (
                         f'<a href="{x}" target="_blank">Repository</a>' if x else "N/A"
                     )
                 ),
-                "Submission Date": pd.to_datetime(df["submission_date"]).dt.strftime(
-                    "%Y-%m-%d"
-                ),
+                "Submission Date": pd.to_datetime(
+                    df["submission_timestamp"]
+                ).dt.strftime("%Y-%m-%d"),
             }
         )
         return df.to_html(escape=False, index=False, classes="styled-table")
     except Exception as e:
-        print(f"Error updating leaderboard: {e}")
-        return "Error updating leaderboard"
+        return f"Error updating leaderboard: {type(e).__name__} - {e}"


-def submit_evaluation(model_name: str, submission_name: str, github_url: str) -> str:
-    if not model_name or not submission_name:
+def submit_evaluation(model_id: str, display_name: str, url: str) -> str:
+    model_id = model_id.strip()
+    display_name = display_name.strip()
+    if not model_id or not display_name:
         return "⚠️ Please provide both model name and submission name."

     try:
-        task_id = start_eval_task(model_name, submission_name, github_url)
+        task_id = start_eval_task(display_name, model_id, url)
         return f"✅ Evaluation submitted successfully! Task ID: {task_id}"
     except Exception as e:
         return f"❌ Error: {str(e)}"

@@ -58,7 +74,6 @@ with gr.Blocks(
     margin: 25px 0;
     font-size: 0.9em;
     font-family: sans-serif;
-    box-shadow: 0 0 20px rgba(0, 0, 0, 0.15);
 }
 .styled-table thead tr {
     background: linear-gradient(45deg, #092746, #073562, #0A648F);

@@ -75,22 +90,18 @@
 }
 """
 ) as demo:
-    gr.Markdown("# 🎯 English Phonemic Transcription Leaderboard")
+    gr.Markdown("# 🎯 English Speech2IPA Leaderboard")
     gr.Markdown("#### Developed By: [Koel Labs](https://koellabs.com)")
     gr.Markdown(
         """
-    ## Explanation of Metrics
+    ## Evaluation
+    We use two standard metrics:
+
     - **PER (Phoneme Error Rate)**: The Levenshtein distance calculated between phoneme sequences of the predicted and actual transcriptions.
-    - **PWED (Phoneme Weighted Edit Distance)**: Edit distance between the predicted and actual phoneme sequences, weighted by the phonemic feature distance. Method by the [panphon library](https://github.com/dmort27/panphon)
+    - **FER (Feature Error Rate)**: The edit distance between the predicted and actual phoneme sequences, weighted by the phonetic features from [panphon](https://github.com/dmort27/panphon).

-    Read more about evaluations on [our blog](https://www.koellabs.com/blog/phonemic-transcription-metrics)
-    """
-    )
-    gr.Markdown(
-        """
-    ## Test Set Information
-    The test set used for evaluation is from the [TIMIT speech corpus](https://www.kaggle.com/datasets/mfekadu/darpa-timit-acousticphonetic-continuous-speech). The TIMIT corpus is a widely used dataset for speech recognition research.
-
+    Models are evaluated on a variety of English speech: native, non-native, and impaired. Read more about evaluations on [our blog](https://www.koellabs.com/blog/phonemic-transcription-metrics)
+
     ## Compute
     This leaderboard uses the free basic plan (16GB RAM, 2vCPUs) to allow for reproducibility. The evaluation may take several hours to complete. Please be patient and do not submit the same model multiple times.

@@ -100,38 +111,55 @@
     )
     with gr.Tabs():
         with gr.TabItem("🏆 Leaderboard"):
+            dataset_dropdown = gr.Dropdown(
+                choices=["Average", "TIMIT", "EpaDB", "PSST", "SpeechOcean", "ISLE"],
+                value=["Average"],
+                multiselect=True,
+                interactive=True,
+                scale=2,
+                container=False,  # Removes the box around the dropdown
+            )
             with gr.Row(elem_classes="controls-row"):
-                # Controls side by side
                 sort_dropdown = gr.Dropdown(
-                    choices=["PWED", "PER"],
-                    value="PWED",
+                    choices=["FER", "PER"],
+                    value="FER",
                     interactive=True,
                     scale=2,
                     container=False,  # Removes the box around the dropdown
-                    label=None,  # Removes the "Sort by" label
                 )
-                refresh_btn = gr.Button("Refresh 🔄", scale=2)  # Simplified button text
+                refresh_btn = gr.Button("Refresh 🔄", scale=2)

-            leaderboard_html = gr.HTML(get_latest_leaderboard_html(sort_dropdown.value))
+            leaderboard_html = gr.HTML("Loading Leaderboard...")
+            demo.load(
+                fn=get_latest_leaderboard_html,
+                inputs=[dataset_dropdown, sort_dropdown],
+                outputs=leaderboard_html,
+                show_progress="minimal",
+            )
+            dataset_dropdown.change(
+                fn=get_latest_leaderboard_html,
+                inputs=[dataset_dropdown, sort_dropdown],
+                outputs=leaderboard_html,
+            )
             sort_dropdown.change(
                 fn=get_latest_leaderboard_html,
-                inputs=[sort_dropdown],
+                inputs=[dataset_dropdown, sort_dropdown],
                 outputs=leaderboard_html,
             )
             refresh_btn.click(
                 fn=get_latest_leaderboard_html,
-                inputs=[sort_dropdown],
+                inputs=[dataset_dropdown, sort_dropdown],
                 outputs=leaderboard_html,
             )

         with gr.TabItem("📝 Submit Model"):
-            model_name = gr.Textbox(
-                label="Model Name", placeholder="facebook/wav2vec2-lv-60-espeak-cv-ft"
+            model_id = gr.Textbox(
+                label="Model ID", placeholder="facebook/wav2vec2-lv-60-espeak-cv-ft"
             )
-            submission_name = gr.Textbox(
-                label="Submission Name", placeholder="My Model v1.0"
+            display_name = gr.Textbox(
+                label="Submission Name", placeholder="Facebook Wav2Vec2 Espeak 60"
             )
-            github_url = gr.Textbox(
+            url = gr.Textbox(
                 label="Github/Kaggle/HF URL (optional)",
                 placeholder="https://github.com/username/repo",
            )

@@ -140,14 +168,14 @@
             submit_btn.click(
                 fn=submit_evaluation,
-                inputs=[model_name, submission_name, github_url],
+                inputs=[model_id, display_name, url],
                 outputs=result,
             )

-        with gr.TabItem("📊 Model Status"):
+        with gr.TabItem("📊 Submission Status"):
             query = gr.Textbox(
-                label="Model Name or Task ID",
-                placeholder="Enter model name (e.g., facebook/wav2vec2-lv-60-espeak-cv-ft)",
+                label="Model ID or Task ID",
+                placeholder="Enter model ID (e.g., facebook/wav2vec2-lv-60-espeak-cv-ft)",
             )
             status_btn = gr.Button("Check Status")
             status_output = gr.JSON(label="Status")
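The refresh logic in the new `get_latest_leaderboard_html` keeps only the most recent submission per `repo_id` before ranking by score. That pattern can be sketched standalone with pandas (toy data; column names taken from the diff):

```python
import pandas as pd

rows = pd.DataFrame(
    {
        "repo_id": ["org/model-a", "org/model-a", "org/model-b"],
        "submission_timestamp": ["2024-01-01", "2024-03-01", "2024-02-01"],
        "average_fer": [0.30, 0.20, 0.25],
    }
)

# Newest first, then keep the first (i.e. latest) row per repo_id,
# then rank so the best (lowest) score sits at the top.
latest = (
    rows.sort_values("submission_timestamp", ascending=False)
    .drop_duplicates("repo_id", keep="first")
    .sort_values("average_fer", ascending=True)
)
print(latest["repo_id"].tolist())  # → ['org/model-a', 'org/model-b']
```

Because resubmissions only shadow older rows rather than overwrite them, the backing dataset stays append-only, which keeps writes to the Hugging Face Dataset simple.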
app/data.py DELETED
@@ -1,180 +0,0 @@
-# This module handles the data loading and preprocessing for various phoneme transcription datasets.
-
-import torch
-import torchaudio
-
-import zipfile
-from pathlib import Path
-
-# Get absolute path
-CURRENT_DIR = Path(__file__).parent.absolute()
-
-# Constants
-DATA_DIR = CURRENT_DIR / "data"
-TIMIT_PATH = DATA_DIR / "TIMIT.zip"
-
-
-# Abstract data manager class
-class DataManager:
-    """Abstract class for handling dataset operations"""
-
-    def get_file_list(self, subset: str) -> list[str]:
-        """Get list of files for given subset"""
-        raise NotImplementedError
-
-    def load_audio(self, filename: str) -> torch.Tensor:
-        """Load and preprocess audio file"""
-        raise NotImplementedError
-
-    def get_phonemes(self, filename: str) -> str:
-        """Get phoneme sequence from file"""
-        raise NotImplementedError
-
-
-# Implement datasets
-class TimitDataManager(DataManager):
-    """Handles all TIMIT dataset operations"""
-
-    # TIMIT to IPA mapping with direct simplifications
-    _TIMIT_TO_IPA = {
-        # Vowels (simplified)
-        "aa": "ɑ",
-        "ae": "æ",
-        "ah": "ʌ",
-        "ao": "ɔ",
-        "aw": "aʊ",
-        "ay": "aɪ",
-        "eh": "ɛ",
-        "er": "ɹ",  # Simplified from 'ɝ'
-        "ey": "eɪ",
-        "ih": "ɪ",
-        "ix": "i",  # Simplified from 'ɨ'
-        "iy": "i",
-        "ow": "oʊ",
-        "oy": "ɔɪ",
-        "uh": "ʊ",
-        "uw": "u",
-        "ux": "u",  # Simplified from 'ʉ'
-        "ax": "ə",
-        "ax-h": "ə",  # Simplified from 'ə̥'
-        "axr": "ɹ",  # Simplified from 'ɚ'
-        # Consonants
-        "b": "",
-        "bcl": "b",
-        "d": "",
-        "dcl": "d",
-        "g": "",
-        "gcl": "g",
-        "p": "",
-        "pcl": "p",
-        "t": "",
-        "tcl": "t",
-        "k": "",
-        "kcl": "k",
-        "dx": "ɾ",
-        "q": "ʔ",
-        # Fricatives
-        "jh": "dʒ",
-        "ch": "tʃ",
-        "s": "s",
-        "sh": "ʃ",
-        "z": "z",
-        "zh": "ʒ",
-        "f": "f",
-        "th": "θ",
-        "v": "v",
-        "dh": "ð",
-        "hh": "h",
-        "hv": "h",  # Simplified from 'ɦ'
-        # Nasals (simplified)
-        "m": "m",
-        "n": "n",
-        "ng": "ŋ",
-        "em": "m",  # Simplified from 'm̩'
-        "en": "n",  # Simplified from 'n̩'
-        "eng": "ŋ",  # Simplified from 'ŋ̍'
-        "nx": "ɾ",  # Simplified from 'ɾ̃'
-        # Semivowels and Glides
-        "l": "l",
-        "r": "ɹ",
-        "w": "w",
-        "wh": "ʍ",
-        "y": "j",
-        "el": "l",  # Simplified from 'l̩'
-        # Special
-        "epi": "",  # Remove epenthetic silence
-        "h#": "",  # Remove start/end silence
-        "pau": "",  # Remove pause
-    }
-
-    def __init__(self, timit_path: Path):
-        self.timit_path = timit_path
-        self._zip_ = None
-        print(f"TimitDataManager initialized with path: {self.timit_path.absolute()}")
-        if not self.timit_path.exists():
-            raise FileNotFoundError(
-                f"TIMIT dataset not found at {self.timit_path.absolute()}. Try running ./scripts/download_data_lfs.sh again."
-            )
-        else:
-            print("TIMIT dataset file exists!")
-
-    @property
-    def _zip(self):
-        if not self._zip_:
-            self._zip_ = zipfile.ZipFile(self.timit_path, "r")
-        return self._zip_
-
-    def get_file_list(self, subset: str) -> list[str]:
-        """Get list of WAV files for given subset"""
-        files = [
-            f
-            for f in self._zip.namelist()
-            if f.endswith(".WAV") and subset.lower() in f.lower()
-        ]
-        print(f"Found {len(files)} WAV files in {subset} subset")
-        if files:
-            print("First 3 files:", files[:3])
-        return files
-
-    def load_audio(self, filename: str) -> torch.Tensor:
-        """Load and preprocess audio file"""
-        with self._zip.open(filename) as wav_file:
-            waveform, sample_rate = torchaudio.load(wav_file)  # type: ignore
-
-        if waveform.shape[0] > 1:
-            waveform = torch.mean(waveform, dim=0, keepdim=True)
-
-        if sample_rate != 16000:
-            waveform = torchaudio.transforms.Resample(sample_rate, 16000)(waveform)
-
-        waveform = (waveform - waveform.mean()) / (waveform.std() + 1e-7)
-
-        if waveform.dim() == 1:
-            waveform = waveform.unsqueeze(0)
-
-        return waveform
-
-    def get_phonemes(self, filename: str) -> str:
-        """Get cleaned phoneme sequence from PHN file and convert to IPA"""
-        phn_file = filename.replace(".WAV", ".PHN")
-        with self._zip.open(phn_file) as f:
-            phonemes = []
-            for line in f.read().decode("utf-8").splitlines():
-                if line.strip():
-                    _, _, phone = line.split()
-                    phone = self._remove_stress_mark(phone)
-                    # Convert to IPA instead of using simplify_timit
-                    ipa = self._TIMIT_TO_IPA.get(phone.lower(), "")
-                    if ipa:
-                        phonemes.append(ipa)
-        return "".join(phonemes)  # Join without spaces for IPA
-
-    def _remove_stress_mark(self, text: str) -> str:
-        """Removes the combining double inverted breve (͡) from text"""
-        if not isinstance(text, str):
-            raise TypeError("Input must be string")
-        return text.replace("͡", "")
-
-
-# Initialize data managers
-timit_manager = TimitDataManager(TIMIT_PATH)
app/data/test/cache-38f74914f01da443.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a7097497f3a64b59d868eb2b3dadf6887b383555398dec8f3b72e75a295ddb5a
3
+ size 1248
app/data/test/cache-43bad43a3f17100a.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6a87f7da6c1210c5efca97e285fdf608b1101e8c6b506a03812ecf082f089aa0
3
+ size 1248
app/data/test/cache-7fc832a0865b46e3.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:966ac1866fb81a68bcb2269ad1293dd2c045022558b6824a87bb66cada9ff28a
3
+ size 1248
app/data/test/cache-8e3b20205f12c8bf.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:78881cfe43c3a668b24c2269adc6219724a7fec0838bcdf74b71e96a583bf0c6
3
+ size 1248
app/data/test/cache-9a41aaef1a199c0a.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d5c1ab32866ac66f93c5798a888db2d32ca3638aa119de45d587325c2d90964d
3
+ size 1248
app/data/test/cache-9a81afba5c72d77e.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:aee3c9a01bfb57a914f31c6255c55cdd42c5cbda23fab357fb80c32710e92389
3
+ size 1248
app/data/test/cache-bf2efb6be770547b.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e2b2b01c7d81595b4ba5e97902db7bf2ef353eacebf9912a930d16570948cd2d
3
+ size 1248
app/data/test/cache-ceccabba78df3ad3.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:312a7ed183b7aabac6a4553b31fd55dcd6a4af9a1627978f8117278c540885da
3
+ size 1248
app/data/test/cache-d8c639c50adcd3ec.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4347f285083472be0661457b0b6cdf927a302e556a5584d10cdedd15ca936919
3
+ size 1248
app/data/test/cache-f9690e73716e8fdd.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d0c055e60b8afcc4c763157f34d0d17f683f0f8b578116eaf9e604a3d178d9e5
3
+ size 1248
app/data/test/data-00000-of-00001.arrow ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:510501aa7be7ece974c2e9feaaad94ec5d38a7fe4e35dee9b3bf2ee9a485062c
3
+ size 53582720
app/data/test/dataset_info.json ADDED
@@ -0,0 +1,19 @@
+ {
+   "citation": "",
+   "description": "",
+   "features": {
+     "audio": {
+       "_type": "Audio"
+     },
+     "ipa": {
+       "dtype": "string",
+       "_type": "Value"
+     },
+     "dataset": {
+       "dtype": "string",
+       "_type": "Value"
+     }
+   },
+   "homepage": "",
+   "license": ""
+ }
app/data/test/state.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "_data_files": [
+     {
+       "filename": "data-00000-of-00001.arrow"
+     }
+   ],
+   "_fingerprint": "8693a894a9182281",
+   "_format_columns": [
+     "audio",
+     "ipa",
+     "dataset"
+   ],
+   "_format_kwargs": {},
+   "_format_type": null,
+   "_output_all_columns": false,
+   "_split": null
+ }
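The `state.json` above is how `datasets` records which Arrow shards and columns make up the saved split. A minimal stdlib sketch of reading it (sample string mirrors the file in this commit; not the actual `datasets` loading path, which goes through `Dataset.load_from_disk`):

```python
import json

# Sample state.json content, mirroring app/data/test/state.json from this commit.
state = json.loads("""
{
  "_data_files": [{"filename": "data-00000-of-00001.arrow"}],
  "_fingerprint": "8693a894a9182281",
  "_format_columns": ["audio", "ipa", "dataset"],
  "_split": null
}
""")

# List the Arrow shard files backing the split.
shards = [f["filename"] for f in state["_data_files"]]
```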
app/hf.py ADDED
@@ -0,0 +1,111 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # This module handles interfacing with the huggingface api
2
+
3
+ from typing import Literal
4
+ from datetime import datetime
5
+
6
+ from huggingface_hub import HfApi
7
+ from huggingface_hub.errors import RepositoryNotFoundError
8
+ from datasets import load_dataset, concatenate_datasets, Dataset, Features, Value
9
+ from datasets.exceptions import DatasetNotFoundError
10
+
11
+ api = HfApi()
12
+
13
+ LEADERBOARD_ID = "KoelLabs/_IPA-TRANSCRIPTION-EN-SCORES"
14
+ LEADERBOARD_FEATURES = Features(
15
+ {
16
+ "display_name": Value("string"),
17
+ "repo_id": Value("string"),
18
+ "repo_hash": Value("string"),
19
+ "repo_last_modified": Value("timestamp[s, tz=UTC]"),
20
+ "submission_timestamp": Value("timestamp[s, tz=UTC]"),
21
+ "average_per": Value("float32"),
22
+ "average_fer": Value("float32"),
23
+ "url": Value("string"),
24
+ "fer_TIMIT": Value("float32"),
25
+ "fer_EpaDB": Value("float32"),
26
+ "fer_PSST": Value("float32"),
27
+ "fer_SpeechOcean": Value("float32"),
28
+ "fer_ISLE": Value("float32"),
29
+ }
30
+ )
31
+ LEADERBOARD_DEFAULTS = {
32
+ "url": "",
33
+ "fer_TIMIT": None,
34
+ "fer_EpaDB": None,
35
+ "fer_PSST": None,
36
+ "fer_SpeechOcean": None,
37
+ "fer_ISLE": None,
38
+ }
39
+
40
+
41
+ def get_repo_info(
42
+ repo_id, type: Literal["model", "dataset", "space"] = "model"
43
+ ) -> tuple[str, datetime]:
44
+ try:
45
+ repo_info = api.repo_info(repo_id=repo_id, repo_type=type)
46
+ return repo_info.sha, repo_info.last_modified # type: ignore
47
+ except RepositoryNotFoundError:
48
+ return "", datetime(year=1970, month=1, day=1)
49
+
50
+
51
+ def get_or_create_leaderboard() -> Dataset:
52
+ modified = False
53
+ try:
54
+ dataset: Dataset = load_dataset(LEADERBOARD_ID)["train"] # type: ignore
55
+ except DatasetNotFoundError:
56
+ empty_data = {col: [] for col in LEADERBOARD_FEATURES.keys()}
57
+ dataset = Dataset.from_dict(empty_data, features=LEADERBOARD_FEATURES)
58
+ modified = True
59
+ except ValueError:
60
+ empty_data = {col: [] for col in LEADERBOARD_FEATURES.keys()}
61
+ dataset = Dataset.from_dict(empty_data, features=LEADERBOARD_FEATURES)
62
+
63
+ for col in LEADERBOARD_FEATURES.keys():
64
+ if col not in dataset.column_names:
65
+ modified = True
66
+ dataset = dataset.add_column(col, [LEADERBOARD_DEFAULTS.get(col)] * len(dataset)) # type: ignore
67
+ dataset = dataset.cast_column(col, feature=LEADERBOARD_FEATURES[col])
68
+
69
+ if modified:
70
+ dataset.push_to_hub(LEADERBOARD_ID, private=True)
71
+
72
+ return dataset
73
+
74
+
75
+ def add_leaderboard_entry(
76
+ display_name: str,
77
+ repo_id: str,
78
+ repo_hash: str,
79
+ repo_last_modified: datetime,
80
+ submission_timestamp: datetime,
81
+ average_per: float,
82
+ average_fer: float,
83
+ url: str,
84
+ per_dataset_fers: dict = {},
85
+ ):
86
+ existing_dataset = get_or_create_leaderboard()
87
+ new_row = Dataset.from_dict(
88
+ dict(
89
+ display_name=[display_name],
90
+ repo_id=[repo_id],
91
+ repo_hash=[repo_hash],
92
+ repo_last_modified=[repo_last_modified.replace(microsecond=0)],
93
+ submission_timestamp=[submission_timestamp.replace(microsecond=0)],
94
+ average_per=[average_per],
95
+ average_fer=[average_fer],
96
+ url=[url],
97
+ fer_TIMIT=[per_dataset_fers.get("TIMIT")],
98
+ fer_EpaDB=[per_dataset_fers.get("EpaDB")],
99
+ fer_PSST=[per_dataset_fers.get("PSST")],
100
+ fer_SpeechOcean=[per_dataset_fers.get("SpeechOcean")],
101
+ fer_ISLE=[per_dataset_fers.get("ISLE")],
102
+ ),
103
+ features=LEADERBOARD_FEATURES,
104
+ )
105
+ combined_dataset = concatenate_datasets([existing_dataset, new_row])
106
+ combined_dataset.push_to_hub(LEADERBOARD_ID, private=True)
107
+
108
+
109
+ if __name__ == "__main__":
110
+ print(get_repo_info(LEADERBOARD_ID, type="dataset"))
111
+ print(get_or_create_leaderboard().to_pandas().head(5)) # type: ignore
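The `add_leaderboard_entry` function above fills every per-dataset FER column, leaving unscored datasets as `None` per `LEADERBOARD_DEFAULTS`. A stdlib-only sketch of that row-assembly logic (the `build_row` helper is hypothetical and stands in for the `Dataset.from_dict` call; it does not touch the Hub):

```python
from datetime import datetime, timezone

# Defaults mirroring LEADERBOARD_DEFAULTS in app/hf.py: per-dataset FERs
# stay None until the model has actually been scored on that dataset.
LEADERBOARD_DEFAULTS = {
    "url": "",
    "fer_TIMIT": None,
    "fer_EpaDB": None,
    "fer_PSST": None,
    "fer_SpeechOcean": None,
    "fer_ISLE": None,
}


def build_row(display_name, repo_id, average_per, average_fer, per_dataset_fers=None):
    """Hypothetical helper: assemble one leaderboard row dict,
    filling unscored datasets from the defaults."""
    row = dict(LEADERBOARD_DEFAULTS)
    for name, score in (per_dataset_fers or {}).items():
        row[f"fer_{name}"] = score
    row.update(
        display_name=display_name,
        repo_id=repo_id,
        average_per=average_per,
        average_fer=average_fer,
        # Timestamps are truncated to whole seconds, matching
        # .replace(microsecond=0) in add_leaderboard_entry.
        submission_timestamp=datetime.now(timezone.utc).replace(microsecond=0),
    )
    return row


row = build_row("demo", "org/model", 0.25, 0.10, {"TIMIT": 0.08})
```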
app/inference.py CHANGED
@@ -1,162 +1,50 @@
- # This module handles model inference and evaluation.
+ # This module handles model inference
 
- from datetime import datetime
- from typing import Optional
-
  import torch
  from transformers import AutoProcessor, AutoModelForCTC
 
- from data import timit_manager
- from phone_metrics import PhoneErrorMetrics
-
- # Initialize evaluation metric
- phone_errors = PhoneErrorMetrics()
-
-
- class ModelManager:
-     """Handles model loading and inference"""
-
-     def __init__(self):
-         self.models = {}
-         self.processors = {}
-         self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-         self.batch_size = 32
-
-     def get_model_and_processor(self, model_name: str):
-         """Get or load model and processor"""
-         if model_name not in self.models:
-             print("Loading processor with phoneme tokenizer...")
-             processor = AutoProcessor.from_pretrained(model_name)
-
-             print("Loading model...", {model_name})
-             model = AutoModelForCTC.from_pretrained(model_name).to(self.device)
-
-             self.models[model_name] = model
-             self.processors[model_name] = processor
-
-         return self.models[model_name], self.processors[model_name]
-
-     def transcribe(self, audio_list: list[torch.Tensor], model_name: str) -> list[str]:
-         """Transcribe a batch of audio using specified model"""
-         model, processor = self.get_model_and_processor(model_name)
-         if not model or not processor:
-             raise Exception("Model and processor not loaded")
-
-         # Process audio in batches
-         all_predictions = []
-         for i in range(0, len(audio_list), self.batch_size):
-             batch_audio = audio_list[i : i + self.batch_size]
-
-             # Pad sequence within batch
-             max_length = max(audio.shape[-1] for audio in batch_audio)
-             padded_audio = torch.zeros((len(batch_audio), 1, max_length))
-             attention_mask = torch.zeros((len(batch_audio), max_length))
-
-             for j, audio in enumerate(batch_audio):
-                 padded_audio[j, :, : audio.shape[-1]] = audio
-                 attention_mask[j, : audio.shape[-1]] = 1
-
-             # Process batch
-             inputs = processor(
-                 padded_audio.squeeze(1).numpy(),
-                 sampling_rate=16000,
-                 return_tensors="pt",
-                 padding=True,
-             )
-
-             input_values = inputs.input_values.to(self.device)
-             attention_mask = inputs.get("attention_mask", attention_mask).to(
-                 self.device
-             )
-
-             with torch.no_grad():
-                 outputs = model(
-                     input_values=input_values, attention_mask=attention_mask
-                 )
-                 logits = outputs.logits
-             predicted_ids = torch.argmax(logits, dim=-1)
-             predictions = processor.batch_decode(
-                 predicted_ids, skip_special_tokens=True
-             )
-             predictions = [pred.replace(" ", "") for pred in predictions]
-             all_predictions.extend(predictions)
-
-         return all_predictions
-
-
- def evaluate_model(
-     model_name: str,
-     subset: str = "test",
-     max_samples: Optional[int] = None,
- ):
-     """Evaluate model on TIMIT dataset"""
-
-     files = timit_manager.get_file_list(subset)
-     if max_samples:
-         files = files[:max_samples]
-
-     results = []
-     total_per = total_pwed = 0
-
-     # Process files in batches
-     batch_size = model_manager.batch_size
-     for i in range(0, len(files), batch_size):
-         batch_files = files[i : i + batch_size]
-
-         # Load batch audio and ground truth
-         batch_audio = []
-         batch_ground_truth = []
-         for wav_file in batch_files:
-             audio = timit_manager.load_audio(wav_file)
-             ground_truth = timit_manager.get_phonemes(wav_file)
-             batch_audio.append(audio)
-             batch_ground_truth.append(ground_truth)
-
-         # Get predictions for batch
-         predictions = model_manager.transcribe(batch_audio, model_name)
-
-         # Calculate metrics for each file in batch
-         for _, (wav_file, prediction, ground_truth) in enumerate(
-             zip(batch_files, predictions, batch_ground_truth)
-         ):
-             metrics = phone_errors.compute(
-                 predictions=[prediction],
-                 references=[ground_truth],
-                 is_normalize_pfer=True,
-             )
-
-             per = metrics["phone_error_rates"][0]
-             pwed = metrics["phone_feature_error_rates"][0]
-
-             results.append(
-                 {
-                     "file": wav_file,
-                     "ground_truth": ground_truth,
-                     "prediction": prediction,
-                     "per": per,
-                     "pwed": pwed,
-                 }
-             )
-
-             total_per += per
-             total_pwed += pwed
-
-     if not results:
-         raise Exception("No files were successfully processed")
-
-     avg_per = total_per / len(results)
-     avg_pwed = total_pwed / len(results)
-
-     return {
-         "model": model_name,
-         "subset": subset,
-         "num_files": len(results),
-         "average_per": avg_per,
-         "average_pwed": avg_pwed,
-         "detailed_results": results[:5],
-         "timestamp": datetime.now().isoformat(),
-     }
-
-
- # Initialize managers
- model_manager = ModelManager()
+ DEVICE = (
+     "cuda"
+     if torch.cuda.is_available()
+     else "mps" if torch.backends.mps.is_available() else "cpu"
+ )
+
+ # set espeak library path for macOS
+ import sys
+
+ if sys.platform == "darwin":
+     from phonemizer.backend.espeak.wrapper import EspeakWrapper
+
+     _ESPEAK_LIBRARY = "/opt/homebrew/Cellar/espeak/1.48.04_1/lib/libespeak.1.1.48.dylib"
+     EspeakWrapper.set_library(_ESPEAK_LIBRARY)
+
+
+ def clear_cache():
+     if torch.cuda.is_available():
+         torch.cuda.empty_cache()
+         torch.cuda.ipc_collect()
+     torch.mps.empty_cache()
+
+
+ def load_model(model_id, device=DEVICE):
+     processor = AutoProcessor.from_pretrained(model_id)
+     model = AutoModelForCTC.from_pretrained(model_id).to(device)
+     return model, processor
+
+
+ def transcribe(audio, model, processor) -> str:
+     input_values = (
+         processor(
+             [audio],
+             sampling_rate=processor.feature_extractor.sampling_rate,
+             return_tensors="pt",
+             padding=True,
+         )
+         .input_values.type(torch.float32)
+         .to(model.device)
+     )
+     with torch.no_grad():
+         logits = model(input_values).logits
+
+     predicted_ids = torch.argmax(logits, dim=-1)
+     return processor.decode(predicted_ids[0])
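The new `transcribe` helper above does greedy CTC decoding: take the argmax token per frame, then `processor.decode` collapses repeated tokens and drops the CTC blank. A stdlib sketch of that collapse rule (the token table is a made-up illustration, not a real model vocabulary):

```python
def ctc_greedy_collapse(ids, blank=0, vocab=None):
    """Collapse consecutive repeats and drop blanks --
    the rule processor.decode applies to the argmax ids."""
    out = []
    prev = None
    for i in ids:
        if i != prev and i != blank:
            out.append(i)
        prev = i
    if vocab is None:
        return out
    return "".join(vocab[i] for i in out)


# Hypothetical vocabulary; id 0 is the CTC blank.
vocab = {1: "h", 2: "ɛ", 3: "l", 4: "oʊ"}
# A blank between the two 'l' frames is what lets a doubled phone survive.
ctc_greedy_collapse([0, 1, 1, 2, 0, 3, 3, 0, 3, 4], vocab=vocab)  # → "hɛlloʊ"
```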
app/metrics.py ADDED
@@ -0,0 +1,33 @@
+ # This module defines evaluation metrics
+
+ from yaml import warnings
+
+ warnings({"YAMLLoadWarning": False})
+
+ import panphon
+ import panphon.distance
+
+ ft = panphon.FeatureTable()
+ panphon_dist = panphon.distance.Distance()
+ inverse_double_weight_sum = 1 / (sum(ft.weights) * 2)
+
+
+ def per(prediction, ground_truth):
+     """
+     Phoneme Error Rate: the number of edits (substitutions, insertions, deletions)
+     needed to transform the prediction into the ground truth divided by the length of the ground truth.
+     """
+     return panphon_dist.fast_levenshtein_distance(prediction, ground_truth) / len(
+         ground_truth
+     )
+
+
+ def fer(prediction, ground_truth):
+     """
+     Feature Error Rate: the edits weighted by their acoustic features summed up and divided by the length of the ground truth.
+     """
+     return (
+         inverse_double_weight_sum
+         * panphon_dist.weighted_feature_edit_distance(ground_truth, prediction)
+         / len(ground_truth)
+     )
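The `per` metric above is plain Levenshtein edit distance normalized by reference length (panphon's `fast_levenshtein_distance` just computes it over IPA segments). A self-contained sketch without panphon, assuming the strings are already one character per phone:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance;
    substitution, insertion, and deletion all cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                 # deletion
                cur[j - 1] + 1,              # insertion
                prev[j - 1] + (ca != cb),    # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def per(prediction, ground_truth):
    """Phoneme Error Rate: edits divided by reference length."""
    return levenshtein(prediction, ground_truth) / len(ground_truth)


per("kæt", "kɑt")  # one substitution over three phones → 1/3
```

The real `fer` differs only in weighting each edit by panphon's articulatory-feature distance instead of a flat cost of 1.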
app/phone_metrics.py DELETED
@@ -1,108 +0,0 @@
- """
- This module implements phone error metrics based on the work from ginic/phone_errors.
- Original implementation: https://huggingface.co/spaces/ginic/phone_errors
-
- Citation:
- @inproceedings{Mortensen-et-al:2016,
-     author = {David R. Mortensen and
-         Patrick Littell and
-         Akash Bharadwaj and
-         Kartik Goyal and
-         Chris Dyer and
-         Lori S. Levin},
-     title = {PanPhon: {A} Resource for Mapping {IPA} Segments to Articulatory Feature Vectors},
-     booktitle = {Proceedings of {COLING} 2016, the 26th International Conference on Computational Linguistics: Technical Papers},
-     pages = {3475--3484},
-     publisher = {{ACL}},
-     year = {2016}
- }
- """
-
- import numpy as np
- import panphon.distance
-
-
- class PhoneErrorMetrics:
-     def __init__(self, feature_model: str = "segment"):
-         """Initialize the phone error metrics calculator.
-
-         Args:
-             feature_model (str): panphon feature parsing model ("strict", "permissive", or "segment")
-         """
-         self.distance_computer = panphon.distance.Distance(feature_model=feature_model)
-
-     def _phone_error_rate(self, prediction: str, reference: str) -> float:
-         """Compute phone error rate between prediction and reference.
-
-         Args:
-             prediction (str): Predicted IPA string
-             reference (str): Reference IPA string
-
-         Returns:
-             float: Phone error rate
-         """
-         if not reference:
-             raise ValueError("Reference string cannot be empty")
-
-         pred_phones = self.distance_computer.fm.ipa_segs(prediction)
-         ref_phones = self.distance_computer.fm.ipa_segs(reference)
-
-         phone_edits = self.distance_computer.min_edit_distance(
-             lambda x: 1,  # deletion cost
-             lambda x: 1,  # insertion cost
-             lambda x, y: 0 if x == y else 1,  # substitution cost
-             [[]],
-             pred_phones,
-             ref_phones,
-         )
-
-         return phone_edits / len(ref_phones)
-
-     def compute(
-         self,
-         predictions: list[str],
-         references: list[str],
-         is_normalize_pfer: bool = False,
-     ) -> dict:
-         """Compute phone error metrics between predictions and references.
-
-         Args:
-             predictions (List[str]): List of predicted IPA strings
-             references (List[str]): List of reference IPA strings
-             is_normalize_pfer (bool): Whether to normalize phone feature error rates
-
-         Returns:
-             Dict containing:
-                 - phone_error_rates: List of PER for each pair
-                 - mean_phone_error_rate: Average PER
-                 - phone_feature_error_rates: List of PFER for each pair
-                 - mean_phone_feature_error_rate: Average PFER
-                 - feature_error_rates: List of FER for each pair
-                 - mean_feature_error_rate: Average FER
-         """
-         phone_error_rates = []
-         feature_error_rates = []
-         hamming_distances = []
-
-         for pred, ref in zip(predictions, references):
-             if is_normalize_pfer:
-                 hd = self.distance_computer.hamming_feature_edit_distance_div_maxlen(
-                     pred, ref
-                 )
-             else:
-                 hd = self.distance_computer.hamming_feature_edit_distance(pred, ref)
-
-             hamming_distances.append(hd)
-             per = self._phone_error_rate(pred, ref)
-             phone_error_rates.append(per)
-             fer = self.distance_computer.feature_error_rate(pred, ref)
-             feature_error_rates.append(fer)
-
-         return {
-             "phone_error_rates": phone_error_rates,
-             "mean_phone_error_rate": float(np.mean(phone_error_rates)),
-             "phone_feature_error_rates": hamming_distances,
-             "mean_phone_feature_error_rate": float(np.mean(hamming_distances)),
-             "feature_error_rates": feature_error_rates,
-             "mean_feature_error_rate": float(np.mean(feature_error_rates)),
-         }
app/queue/leaderboard.json DELETED
@@ -1,192 +0,0 @@
- [
-   {
-     "submission_id": "8e6a3a00-59fa-4a24-861d-a132a8212658",
-     "submission_name": "facebook espeak",
-     "model": "facebook/wav2vec2-lv-60-espeak-cv-ft",
-     "average_per": 0.33667301260691423,
-     "average_pwed": 0.1276725657099669,
-     "subset": "timit-test",
-     "github_url": "https://github.com/facebookresearch/fairseq/blob/main/examples/wav2vec/README.md",
-     "submission_date": "2024-12-05T07:32:06.850230"
-   },
-   {
-     "submission_id": "70aceb68-ad86-4a83-9998-08adb27b4d5c",
-     "submission_name": "english phoneme model",
-     "model": "KoelLabs/xlsr-timit-b0",
-     "average_per": 0.12572285528714347,
-     "average_pwed": 0.06476636812791145,
-     "subset": "timit-test",
-     "github_url": "https://github.com/KoelLabs/",
-     "submission_date": "2024-12-05T08:25:24.982477"
-   },
-   {
-     "submission_id": "80b57299-b3ab-4caf-ac4a-898c8398046e",
-     "submission_name": "speech 31 model",
-     "model": "speech31/wav2vec2-large-TIMIT-IPA",
-     "average_per": 0.4415425496841929,
-     "average_pwed": 0.18625930002594002,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/speech31/wav2vec2-large-TIMIT-IPA2",
-     "submission_date": "2024-12-05T09:36:14.570315"
-   },
-   {
-     "submission_id": "0cbcab0a-bd07-421f-82a0-480c9507a214",
-     "submission_name": "jubiliano model wav2vec2",
-     "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.5",
-     "average_per": 0.6318471187460027,
-     "average_pwed": 0.222932144739126,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.5WithoutSpaces/tree/d5312009d8e620b183c334dfdd9ffc6b4f06f8c1",
-     "submission_date": "2024-12-05T10:17:21.334530"
-   },
-   {
-     "submission_id": "0fc29c54-3db2-46b6-aeee-c96484306751",
-     "submission_name": "xlsr 53 model",
-     "model": "facebook/wav2vec2-xlsr-53-espeak-cv-ft",
-     "average_per": 0.348845592557092,
-     "average_pwed": 0.1386742019529415,
-     "subset": "timit-test",
-     "github_url": "https://github.com/facebookresearch/fairseq/blob/main/examples/wav2vec/README.md",
-     "submission_date": "2024-12-05T10:34:26.157054"
-   },
-   {
-     "submission_id": "a23026ec-acac-4481-9761-f9368b4b94f1",
-     "submission_name": "ginic model wav2vec2 finetuned on buckeye",
-     "model": "ginic/hyperparam_tuning_1_wav2vec2-large-xlsr-buckeye-ipa",
-     "average_per": 0.2766466385175833,
-     "average_pwed": 0.10410683992600853,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/ginic/vary_individuals_old_only_1_wav2vec2-large-xlsr-buckeye-ipa",
-     "submission_date": "2024-12-05T11:06:07.984825"
-   },
-   {
-     "submission_id": "e3bbf521-cc32-43a6-bf1c-5ddc6bce04ab",
-     "submission_name": "koel labs initial ",
-     "model": "KoelLabs/xlsr-timit-a0",
-     "average_per": 0.24242141955346685,
-     "average_pwed": 0.17395311976938,
-     "subset": "timit-test",
-     "github_url": "https://github.com/KoelLabs/ML/",
-     "submission_date": "2024-12-12T16:07:25.391145"
-   },
-   {
-     "submission_id": "02f223d4-7b98-4613-9377-19b74defe308",
-     "submission_name": "wav2vec2 ipa eng ",
-     "model": "snu-nia-12/wav2vec2-large_nia12_phone-ipa_english",
-     "average_per": 0.4847029843149011,
-     "average_pwed": 0.2072006544586948,
-     "subset": "timit-test",
-     "github_url": null,
-     "submission_date": "2024-12-18T22:01:20.855881"
-   },
-   {
-     "submission_id": "bed08468-42c7-459f-a46d-49ead50abfbc",
-     "submission_name": "fine-tuned version of facebook/wav2vec2-xls-r-300m on the Timit dataset",
-     "model": "vitouphy/wav2vec2-xls-r-300m-timit-phoneme",
-     "average_per": 0.2561961414705681,
-     "average_pwed": 0.1378394393452702,
-     "subset": "timit-test",
-     "github_url": "https://www.kaggle.com/code/vitouphy/phoneme-recognition-with-wav2vec2",
-     "submission_date": "2024-12-18T22:50:59.627338"
-   },
-   {
-     "submission_id": "4086072e-9368-442f-97cd-1fda6bf6656e",
-     "submission_name": "wav2vec2 model",
-     "model": "ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa-plus-2000",
-     "average_per": 0.6479484324708775,
-     "average_pwed": 0.18710002665151734,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa1000-ns",
-     "submission_date": "2024-12-18T23:29:27.322286"
-   },
-   {
-     "submission_id": "d0b2f8b4-20f8-45b4-b1a5-c81390d75b29",
-     "submission_name": "wav2vec2 non-english transcription",
-     "model": "ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa1000-ns",
-     "average_per": 0.6417205190285036,
-     "average_pwed": 0.19048963968896404,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa1000-ns",
-     "submission_date": "2024-12-19T07:41:18.135985"
-   },
-   {
-     "submission_id": "3bbb0f03-31a5-45b0-bde3-bbf574f19983",
-     "submission_name": "phonetic transcription with the Buckeye corpus, from xlsr-53 model",
-     "model": "ginic/gender_split_70_female_4_wav2vec2-large-xlsr-buckeye-ipa",
-     "average_per": 0.2810165988557621,
-     "average_pwed": 0.10703377161801164,
-     "subset": "timit-test",
-     "github_url": "https://github.com/ginic/multipa/tree/buckeye_experiments",
-     "submission_date": "2024-12-20T13:45:52.010575"
-   },
-   {
-     "submission_id": "2ed095f7-4712-4539-87b6-1e8588ac92a3",
-     "submission_name": "phonetic transcription",
-     "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.9.2WithoutSpaces",
-     "average_per": 0.9537775908999574,
-     "average_pwed": 0.9351204819224959,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.5WithoutSpaces",
-     "submission_date": "2024-12-20T14:21:32.293694"
-   },
-   {
-     "submission_id": "9cf02ce8-fc43-4d23-a8bb-b44e3116a93c",
-     "submission_name": "Jubliano xlsr model",
-     "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-nl",
-     "average_per": 0.9887075544197294,
-     "average_pwed": 0.9692486915717254,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/Jubliano/wav2vec2-large-xls-r-300m-ipa-nl1.1",
-     "submission_date": "2024-12-20T15:40:51.632895"
-   },
-   {
-     "submission_id": "d5013845-f5c9-428a-8b39-7db066bb9f05",
-     "submission_name": "speech31 phoneme transcription english",
-     "model": "speech31/wavlm-large-english-ipa",
-     "average_per": 0.3694017596969614,
-     "average_pwed": 0.1356824900612308,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/speech31/wavlm-large-english-ipa",
-     "submission_date": "2024-12-20T16:26:47.982209"
-   },
-   {
-     "submission_id": "362c788d-bc2e-427d-8c74-105f6235cf62",
-     "submission_name": "speech31 xlsr model",
-     "model": "speech31/XLS-R-300m-english-ipa",
-     "average_per": 0.36382554692045954,
-     "average_pwed": 0.1299702312124616,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/speech31/XLS-R-300m-english-ipa",
-     "submission_date": "2024-12-20T16:47:54.826509"
-   },
-   {
-     "submission_id": "49e22782-0af1-4313-bc0c-60cb2f28d78f",
-     "submission_name": "model is a fine-tuned version of facebook/wav2vec2-large on the TIMIT dataset",
-     "model": "speech31/wav2vec2-large-english-TIMIT-phoneme_v3",
-     "average_per": 0.44563344149564776,
-     "average_pwed": 0.18844914029048124,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/speech31/wav2vec2-large-english-TIMIT-phoneme_v3",
-     "submission_date": "2024-12-20T17:05:35.213738"
-   },
-   {
-     "submission_id": "26c04108-1131-435c-95f1-bb56b2aff06c",
-     "submission_name": "fine-tuned version of facebook/wav2vec2-large on the None dataset",
-     "model": "speech31/wav2vec2-large-TIMIT-IPA2",
-     "average_per": 0.4847029843149011,
-     "average_pwed": 0.2072006544586948,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/speech31/wav2vec2-large-TIMIT-IPA2",
-     "submission_date": "2024-12-20T22:50:50.645178"
-   },
-   {
-     "submission_id": "4126d265-418f-4d11-8a29-4e69f064f1dd",
-     "submission_name": "ginic model, facebook/wav2vec2-large-xlsr-53 fine tuned",
-     "model": "ginic/vary_individuals_young_only_3_wav2vec2-large-xlsr-buckeye-ipa",
-     "average_per": 0.2807914104790719,
-     "average_pwed": 0.10494355278037441,
-     "subset": "timit-test",
-     "github_url": "https://huggingface.co/ginic/vary_individuals_young_only_3_wav2vec2-large-xlsr-buckeye-ipa",
-     "submission_date": "2024-12-21T01:31:04.862397"
-   }
- ]
app/queue/results.json DELETED
@@ -1,1014 +0,0 @@
- [
-   {
-     "task_id": "721b4c64-a825-42d3-bb0a-bdff9ee1ed0f",
-     "model": "facebook/wav2vec2-lv-60-espeak-cv-ft",
-     "subset": "timit-test",
-     "num_files": 1680,
-     "average_per": 0.33667301260691423,
-     "average_pwed": 0.1276725657099669,
-     "detailed_results": [
-       {
-         "file": "data/TEST/DR1/FAKS0/SA1.WAV",
-         "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
-         "prediction": "ʃiːhædjɚdɑːɹksuːɾɪnɡɹiːsiwɑːʃwɑːɾɚɹɑːljiː",
-         "per": 0.3939393939393939,
-         "pwed": 0.13888888888888887
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SA2.WAV",
-         "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
-         "prediction": "doʊntæskmiːtəkæɹiɐnoɪliɹæɡlaɪkðæt",
-         "per": 0.32142857142857145,
-         "pwed": 0.13541666666666666
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
-         "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
-         "prediction": "hɪzkæptənwʌzθɪnændhæɡɚdændhɪzbjuːɾɪfəlbuːtswɜːwɔːɹnændʃæbi",
-         "per": 0.3617021276595745,
-         "pwed": 0.13915094339622644
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
-         "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
-         "prediction": "ðəɹiːzənzfɜːðɪsdaɪvsiːmdfuːlɪʃnaʊ",
-         "per": 0.20689655172413793,
-         "pwed": 0.022988505747126433
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI943.WAV",
-         "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
-         "prediction": "pɹədʌkʃənmeɪfɔːlfɑːɹbᵻloʊɛkspɛkteɪʃənz",
-         "per": 0.36363636363636365,
-         "pwed": 0.1392857142857143
-       }
-     ],
-     "timestamp": "2024-12-05T07:32:06.849017"
-   },
-   {
-     "task_id": "d6fe0956-b5b4-4105-835e-8dee1872ee4d",
-     "model": "KoelLabs/xlsr-timit-b0",
-     "subset": "timit-test",
-     "num_files": 1680,
-     "average_per": 0.12572285528714347,
-     "average_pwed": 0.06476636812791145,
-     "detailed_results": [
-       {
-         "file": "data/TEST/DR1/FAKS0/SA1.WAV",
-         "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
-         "prediction": "ʃihædjɹdɑɹksuɾɪnɡɹisiwɑʃwɔɾɹʔɔljɪɹ",
-         "per": 0.12121212121212122,
-         "pwed": 0.037990196078431376
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SA2.WAV",
-         "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
-         "prediction": "oʊnæskmitikæɹinɔɪliɹæɡlaɪkðæt",
-         "per": 0.14285714285714285,
-         "pwed": 0.10632183908045977
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
-         "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
-         "prediction": "hɪzkæpinwəsθɪnhæɡɹdinizbjuɾiflbutswɹwɔɹninʃæbi",
-         "per": 0.10638297872340426,
-         "pwed": 0.0425531914893617
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
-         "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
-         "prediction": "ðəɹiznzfɹðistaɪvsimdfuliʃnaʊ",
-         "per": 0.13793103448275862,
-         "pwed": 0.04166666666666667
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI943.WAV",
-         "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
-         "prediction": "pɹdʌkʃnmeɪfɔlfɑɹbloʊɛkspɛkeɪʃəns",
-         "per": 0.21212121212121213,
-         "pwed": 0.10858585858585859
-       }
-     ],
-     "timestamp": "2024-12-05T08:25:24.980111"
-   },
-   {
-     "task_id": "dbf4642a-fb13-402c-8a74-cc41fc4be599",
-     "model": "speech31/wav2vec2-large-TIMIT-IPA",
-     "subset": "timit-test",
-     "num_files": 1680,
-     "average_per": 0.4415425496841929,
-     "average_pwed": 0.18625930002594002,
-     "detailed_results": [
-       {
-         "file": "data/TEST/DR1/FAKS0/SA1.WAV",
-         "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
-         "prediction": "ʃihædjʊrdɑrksutɪngrisiwɑʃwɔtərɔljɪrrrɪrɪrʃ",
-         "per": 0.5757575757575758,
-         "pwed": 0.25
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SA2.WAV",
-         "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
-         "prediction": "doʊntæskmitɪkɛri��nɔɪliræglaɪkðəttm",
-         "per": 0.35714285714285715,
-         "pwed": 0.172979797979798
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
-         "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
-         "prediction": "hɪzkæptɪnwɑzθɪnəndhægərdəndhɪzbjutəfəlbutswərwɔrnəndʃæbi",
-         "per": 0.40425531914893614,
-         "pwed": 0.17500000000000004
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
-         "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
-         "prediction": "ðərizɪənzfərðɪstaɪvsimdfulɪʃnaʊaʊaʊ",
-         "per": 0.3793103448275862,
-         "pwed": 0.18928571428571428
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI943.WAV",
-         "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
-         "prediction": "prədəkʃənmeɪfɔlfɑrbɪloʊɛkspɛkteɪʃənzd",
-         "per": 0.3939393939393939,
-         "pwed": 0.13626126126126126
-       }
-     ],
-     "timestamp": "2024-12-05T09:36:14.568321"
-   },
-   {
-     "task_id": "912449a4-d7ed-4af4-b5be-5c2c57ec09ff",
-     "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.5",
-     "subset": "timit-test",
-     "num_files": 1680,
-     "average_per": 0.6318471187460027,
-     "average_pwed": 0.222932144739126,
-     "detailed_results": [
-       {
-         "file": "data/TEST/DR1/FAKS0/SA1.WAV",
-         "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
-         "prediction": "ʒihɛldjydɑrksydənrisiwɑswadərɑlhir",
-         "per": 0.5454545454545454,
-         "pwed": 0.11764705882352941
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SA2.WAV",
-         "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
-         "prediction": "dɑnraːstɪkmədəkaːrənoːjliralɪkaːn",
-         "per": 0.7857142857142857,
-         "pwed": 0.2341954022988506
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
-         "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
-         "prediction": "xisʃktəʋɑstɪnɛnhɛɪɡərdɛnenzbjudəvɔlbutvɔːrʋɔrnənʃaːbi",
-         "per": 0.6595744680851063,
-         "pwed": 0.18382352941176472
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
-         "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
-         "prediction": "dərizənsvərdəstajfzimtvuləsna",
-         "per": 0.6206896551724138,
-         "pwed": 0.11781609195402297
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI943.WAV",
-         "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
-         "prediction": "pːdkəmeːvɑlvɑrbəloɛkspɛkteːʃəns",
-         "per": 0.5454545454545454,
-         "pwed": 0.2171717171717172
-       }
-     ],
-     "timestamp": "2024-12-05T10:17:21.331572"
-   },
-   {
-     "task_id": "c79df17e-2bb2-4253-ae26-f7cc6ab21265",
-     "model": "facebook/wav2vec2-xlsr-53-espeak-cv-ft",
-     "subset": "timit-test",
-     "num_files": 1680,
-     "average_per": 0.348845592557092,
-     "average_pwed": 0.1386742019529415,
-     "detailed_results": [
-       {
-         "file": "data/TEST/DR1/FAKS0/SA1.WAV",
-         "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
-         "prediction": "ʃiːhædjɚdksuːtɪnɡɹiːsiwɑːʃwɑːɾɚɑːljɪ",
-         "per": 0.48484848484848486,
-         "pwed": 0.21338383838383837
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SA2.WAV",
-         "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
-         "prediction": "doːntæskmitəkæɹiənoɪliɹæɡlaɪkðæt",
-         "per": 0.32142857142857145,
-         "pwed": 0.12634408602150538
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
-         "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
-         "prediction": "hɪzkæptənwʌzθɪnænhæɡɚdændhɪzbjuːɾɪfʊbuːtswɚwoːnəndʃæbi",
-         "per": 0.3617021276595745,
-         "pwed": 0.13095238095238093
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
-         "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
-         "prediction": "ðəɹiːzənzfɚðəsdɑːvsiːmdfuːlɪʃnæ",
-         "per": 0.3793103448275862,
-         "pwed": 0.12068965517241376
-       },
-       {
-         "file": "data/TEST/DR1/FAKS0/SI943.WAV",
-         "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
-         "prediction": "pɹədʌkʃənmeɪfɑːlfɑːbəloʊɛkspɛkteɪʃənz",
-         "per": 0.36363636363636365,
-         "pwed": 0.14404761904761906
-       }
-     ],
-     "timestamp": "2024-12-05T10:34:26.154521"
-   },
-   {
-     "task_id": "f36060e6-a746-44dc-a527-54995b270053",
-     "model": "ginic/hyperparam_tuning_1_wav2vec2-large-xlsr-buckeye-ipa",
-     "subset": "timit-test",
-     "num_files": 1680,
-     "average_per": 0.2766466385175833,
-     "average_pwed": 0.10410683992600853,
-     "detailed_results": [
-       {
-         "file": "data/TEST/DR1/FAKS0/SA1.WAV",
-         "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
-         "prediction": "ʃihædjɹ̩dɑɹksuɾɪnɡɹeɪsiwɑʃwɔɾɹ̩ɔljiɹ"
244
- "per": 0.24242424242424243,
245
- "pwed": 0.09926470588235292
246
- },
247
- {
248
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
249
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
250
- "prediction": "doʊndæskmidɪkæɹiɛnɔɪliɹæɡlaɪkðæʔ",
251
- "per": 0.32142857142857145,
252
- "pwed": 0.14192708333333334
253
- },
254
- {
255
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
256
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
257
- "prediction": "hɪzkæptɪnwʌzθɪnɛnhæɡɹ̩dɛnɪzbjuɾʌfl̩butswɹ̩wɔɹnɛnʃæbi",
258
- "per": 0.2553191489361702,
259
- "pwed": 0.05357142857142857
260
- },
261
- {
262
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
263
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
264
- "prediction": "ðʌɹizʌnzfɹ̩ðʌstaɪvsimdfulɪʃnaʊ",
265
- "per": 0.20689655172413793,
266
- "pwed": 0.01293103448275862
267
- },
268
- {
269
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
270
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
271
- "prediction": "pɹʌdʌkʃʌnmeɪfɔlfɑɹbʌloʊɛkspɛkteɪʃʌns",
272
- "per": 0.2727272727272727,
273
- "pwed": 0.10416666666666667
274
- }
275
- ],
276
- "timestamp": "2024-12-05T11:06:07.981224"
277
- },
278
- {
279
- "task_id": "47d56349-8111-4bda-a47f-e007dbedd36d",
280
- "model": "KoelLabs/xlsr-timit-a0",
281
- "subset": "timit-test",
282
- "num_files": 1680,
283
- "average_per": 0.24242141955346685,
284
- "average_pwed": 0.17395311976938,
285
- "detailed_results": [
286
- {
287
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
288
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
289
- "prediction": "ʃihædjɹdɑɹksuɾɪnɡɹisiwɑʃwɔɾɹʔɔljɪɹ",
290
- "per": 0.12121212121212122,
291
- "pwed": 0.037990196078431376
292
- },
293
- {
294
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
295
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
296
- "prediction": "ɪoʊnæskmitikæɹinɔɪliɹæɡlaɪkðt",
297
- "per": 0.21428571428571427,
298
- "pwed": 0.1695402298850575
299
- },
300
- {
301
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
302
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
303
- "prediction": "hɪzkæpinwəsθɪninhæɡɹdinhizbjuɾiflbutswɹwɔɹnintʃæbi",
304
- "per": 0.1276595744680851,
305
- "pwed": 0.06499999999999999
306
- },
307
- {
308
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
309
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
310
- "prediction": "ðəɹiznzfɹðistaɪ",
311
- "per": 0.5862068965517241,
312
- "pwed": 0.4899425287356322
313
- },
314
- {
315
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
316
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
317
- "prediction": "ɹidʌkʃinmeɪfɔlfɑɹbəloʊɛkspɛkeɪ",
318
- "per": 0.21212121212121213,
319
- "pwed": 0.1553030303030303
320
- }
321
- ],
322
- "timestamp": "2024-12-12T15:53:07.584096"
323
- },
324
- {
325
- "task_id": "51dd5735-63bd-4fe5-a588-c0fc079076e0",
326
- "model": "KoelLabs/xlsr-timit-a0",
327
- "subset": "timit-test",
328
- "num_files": 1680,
329
- "average_per": 0.24242141955346685,
330
- "average_pwed": 0.17395311976938,
331
- "detailed_results": [
332
- {
333
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
334
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
335
- "prediction": "ʃihædjɹdɑɹksuɾɪnɡɹisiwɑʃwɔɾɹʔɔljɪɹ",
336
- "per": 0.12121212121212122,
337
- "pwed": 0.037990196078431376
338
- },
339
- {
340
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
341
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
342
- "prediction": "ɪoʊnæskmitikæɹinɔɪliɹæɡlaɪkðt",
343
- "per": 0.21428571428571427,
344
- "pwed": 0.1695402298850575
345
- },
346
- {
347
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
348
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
349
- "prediction": "hɪzkæpinwəsθɪninhæɡɹdinhizbjuɾiflbutswɹwɔɹnintʃæbi",
350
- "per": 0.1276595744680851,
351
- "pwed": 0.06499999999999999
352
- },
353
- {
354
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
355
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
356
- "prediction": "ðəɹiznzfɹðistaɪ",
357
- "per": 0.5862068965517241,
358
- "pwed": 0.4899425287356322
359
- },
360
- {
361
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
362
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
363
- "prediction": "ɹidʌkʃinmeɪfɔlfɑɹbəloʊɛkspɛkeɪ",
364
- "per": 0.21212121212121213,
365
- "pwed": 0.1553030303030303
366
- }
367
- ],
368
- "timestamp": "2024-12-12T16:07:25.389475"
369
- },
370
- {
371
- "task_id": "2e592612-ca38-4afb-a6a0-3c870b288960",
372
- "model": "snu-nia-12/wav2vec2-large_nia12_phone-ipa_english",
373
- "subset": "timit-test",
374
- "num_files": 1680,
375
- "average_per": 0.4847029843149011,
376
- "average_pwed": 0.2072006544586948,
377
- "detailed_results": [
378
- {
379
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
380
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
381
- "prediction": "ʃihædjʊrdɑrksutɪngrisiwɑʃwɔtərɔljɪrər",
382
- "per": 0.42424242424242425,
383
- "pwed": 0.15393518518518517
384
- },
385
- {
386
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
387
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
388
- "prediction": "doʊntæskmitɪkɛriənɔɪliræglaɪkðətdoʊndt",
389
- "per": 0.5,
390
- "pwed": 0.2623873873873874
391
- },
392
- {
393
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
394
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
395
- "prediction": "hɪzkæptənwɑzθɪnəndhægərdəndhɪzbjutəfəlbutswərwɔrnəndʃæbiiii",
396
- "per": 0.46808510638297873,
397
- "pwed": 0.2191091954022989
398
- },
399
- {
400
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
401
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
402
- "prediction": "ðərizənzfərðɪstaɪvsimdfulɪʃnaʊ",
403
- "per": 0.20689655172413793,
404
- "pwed": 0.054166666666666675
405
- },
406
- {
407
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
408
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
409
- "prediction": "prədəkʃənmeɪfɔlfɑrbɪloʊɛkspɛkteɪʃənzpzppppzpdtdtd",
410
- "per": 0.7272727272727273,
411
- "pwed": 0.34438775510204084
412
- }
413
- ],
414
- "timestamp": "2024-12-18T22:01:20.853274"
415
- },
416
- {
417
- "task_id": "d38e65ce-75b5-4dbf-8ade-bff6a5803790",
418
- "model": "vitouphy/wav2vec2-xls-r-300m-timit-phoneme",
419
- "subset": "timit-test",
420
- "num_files": 1680,
421
- "average_per": 0.2561961414705681,
422
- "average_pwed": 0.1378394393452702,
423
- "detailed_results": [
424
- {
425
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
426
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
427
- "prediction": "ʃihædjɝdɑɹksuɾɪngɹisiwɑʃwɑɾɝɑljiɝ",
428
- "per": 0.18181818181818182,
429
- "pwed": 0.13257575757575757
430
- },
431
- {
432
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
433
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
434
- "prediction": "doʊnæskmitɪkæɹiɪnɔɪliɹæglaɪkðæ",
435
- "per": 0.21428571428571427,
436
- "pwed": 0.10919540229885057
437
- },
438
- {
439
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
440
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
441
- "prediction": "hɪzkætɪnwəsθɪnənhægɝdɪnɪzbjuɾɪflbutswɝwɑɹnɪnʃæbi",
442
- "per": 0.19148936170212766,
443
- "pwed": 0.0576241134751773
444
- },
445
- {
446
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
447
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
448
- "prediction": "ðɪɹizənzfɝðɪsdaɪvsimdfulɪʃnaʊ",
449
- "per": 0.10344827586206896,
450
- "pwed": 0.03735632183908046
451
- },
452
- {
453
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
454
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
455
- "prediction": "pɹɝdəkʃɪnmeɪfɑlfɹbloʊɛkspɛteɪʃɪns",
456
- "per": 0.3333333333333333,
457
- "pwed": 0.12373737373737376
458
- }
459
- ],
460
- "timestamp": "2024-12-18T22:50:59.625872"
461
- },
462
- {
463
- "task_id": "2839c0c6-8f3b-426e-9eb7-04b6e133dc47",
464
- "model": "ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa-plus-2000",
465
- "subset": "timit-test",
466
- "num_files": 1680,
467
- "average_per": 0.6479484324708775,
468
- "average_pwed": 0.18710002665151734,
469
- "detailed_results": [
470
- {
471
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
472
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
473
- "prediction": "ʂixadjodarksyːdɨnɡwisiwaːʃwarɒɔjiːr",
474
- "per": 0.6060606060606061,
475
- "pwed": 0.15404040404040406
476
- },
477
- {
478
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
479
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
480
- "prediction": "dondaːskmiːdɨkɛːɻjɒnojluiʋɻaːɡlɑjɡtaːn",
481
- "per": 0.8928571428571429,
482
- "pwed": 0.2146464646464646
483
- },
484
- {
485
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
486
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
487
- "prediction": "hizkaːptanustinanhagɛɻdɛnizbiurufubutswuɾʋoːɻninʂaːbi",
488
- "per": 0.5106382978723404,
489
- "pwed": 0.1096938775510204
490
- },
491
- {
492
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
493
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
494
- "prediction": "ðrisɔnsfrdɔsdaːjvsimtfulɛʂnɛ",
495
- "per": 0.5172413793103449,
496
- "pwed": 0.11063218390804598
497
- },
498
- {
499
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
500
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
501
- "prediction": "pɛdakɕɔnmɛjfaɔfarbuwɔwɛkspɛktajʂɔnt͡s",
502
- "per": 0.7272727272727273,
503
- "pwed": 0.15
504
- }
505
- ],
506
- "timestamp": "2024-12-18T23:29:27.320433"
507
- },
508
- {
509
- "task_id": "59afc37a-0072-44dd-a02a-0cf47d89c120",
510
- "model": "ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa1000-ns",
511
- "subset": "timit-test",
512
- "num_files": 1680,
513
- "average_per": 0.6417205190285036,
514
- "average_pwed": 0.19048963968896404,
515
- "detailed_results": [
516
- {
517
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
518
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
519
- "prediction": "ʂiharjoɖarksɯudenɡwisiwaːʂwarɔːjiːr",
520
- "per": 0.696969696969697,
521
- "pwed": 0.20580808080808083
522
- },
523
- {
524
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
525
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
526
- "prediction": "dɔndaːskmidɨkaːɻjɑno̞jwɯräːɡläikθaːn",
527
- "per": 0.8214285714285714,
528
- "pwed": 0.17338709677419356
529
- },
530
- {
531
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
532
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
533
- "prediction": "çizkatːɛnwɔstinanhaːɡɛɾdanɨzbirufubuswɔwoːɾnenʂaːbi",
534
- "per": 0.5531914893617021,
535
- "pwed": 0.1276595744680851
536
- },
537
- {
538
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
539
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
540
- "prediction": "ðɔriːzɔnsfɾdɔɕtaːivsimtfuøʃnɛu",
541
- "per": 0.5862068965517241,
542
- "pwed": 0.08764367816091957
543
- },
544
- {
545
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
546
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
547
- "prediction": "pɾɔdakʂɔnmɛjfaɔfaɾbuwɔuwɛkspɛktajʂons",
548
- "per": 0.7575757575757576,
549
- "pwed": 0.18806306306306303
550
- }
551
- ],
552
- "timestamp": "2024-12-19T07:41:18.132953"
553
- },
554
- {
555
- "task_id": "5517f6b2-6a76-4a2d-a6ce-33446f390c3b",
556
- "model": "ginic/gender_split_70_female_4_wav2vec2-large-xlsr-buckeye-ipa",
557
- "subset": "timit-test",
558
- "num_files": 1680,
559
- "average_per": 0.2810165988557621,
560
- "average_pwed": 0.10703377161801164,
561
- "detailed_results": [
562
- {
563
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
564
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
565
- "prediction": "ʃihædjɹ̩dɑɹksudɪnɡɹisiwɑʃwɑɾɹ̩ɔljiɹ",
566
- "per": 0.18181818181818182,
567
- "pwed": 0.07196969696969698
568
- },
569
- {
570
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
571
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
572
- "prediction": "doʊndæskmitɪkæɹiʌnɔɪliɹæɡlaɪkðæʔ",
573
- "per": 0.2857142857142857,
574
- "pwed": 0.14062500000000003
575
- },
576
- {
577
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
578
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
579
- "prediction": "hɪzkæptʌnwʌzθɪnhæɡɹ̩dɛnɪzbjuɾʌfl̩butswɹ̩wɔʊɹnɪnʃæbi",
580
- "per": 0.2978723404255319,
581
- "pwed": 0.09114583333333333
582
- },
583
- {
584
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
585
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
586
- "prediction": "ðʌɹizʌnzfɹ̩ðʌstaɪvsimtfulɪʃnaʊ",
587
- "per": 0.2413793103448276,
588
- "pwed": 0.014367816091954023
589
- },
590
- {
591
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
592
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
593
- "prediction": "pɹʌdʌkʃʌnmeɪfɔlfɑɹbʌloʊɛkspɛkteɪʃʌnz",
594
- "per": 0.30303030303030304,
595
- "pwed": 0.10532407407407407
596
- }
597
- ],
598
- "timestamp": "2024-12-20T13:45:52.009233"
599
- },
600
- {
601
- "task_id": "c2139f96-e79e-4f25-a525-aa039f65555f",
602
- "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.9.2WithoutSpaces",
603
- "subset": "timit-test",
604
- "num_files": 1680,
605
- "average_per": 0.9537775908999574,
606
- "average_pwed": 0.9351204819224959,
607
- "detailed_results": [
608
- {
609
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
610
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
611
- "prediction": "iɛ2",
612
- "per": 0.9696969696969697,
613
- "pwed": 0.9406565656565656
614
- },
615
- {
616
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
617
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
618
- "prediction": "iɛ2",
619
- "per": 0.9285714285714286,
620
- "pwed": 0.9285714285714286
621
- },
622
- {
623
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
624
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
625
- "prediction": "iɛ2",
626
- "per": 0.9787234042553191,
627
- "pwed": 0.9583333333333333
628
- },
629
- {
630
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
631
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
632
- "prediction": "iɛ2",
633
- "per": 0.9655172413793104,
634
- "pwed": 0.932471264367816
635
- },
636
- {
637
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
638
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
639
- "prediction": "iɛ2",
640
- "per": 0.9696969696969697,
641
- "pwed": 0.9406565656565656
642
- }
643
- ],
644
- "timestamp": "2024-12-20T14:21:32.290889"
645
- },
646
- {
647
- "task_id": "d146f1f1-6e6e-4b28-9420-c652ae9a1002",
648
- "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-nl",
649
- "subset": "timit-test",
650
- "num_files": 1680,
651
- "average_per": 0.9887075544197294,
652
- "average_pwed": 0.9692486915717254,
653
- "detailed_results": [
654
- {
655
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
656
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
657
- "prediction": "p",
658
- "per": 1.0,
659
- "pwed": 0.9747474747474747
660
- },
661
- {
662
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
663
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
664
- "prediction": "p",
665
- "per": 1.0,
666
- "pwed": 0.96875
667
- },
668
- {
669
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
670
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
671
- "prediction": "p",
672
- "per": 0.9787234042553191,
673
- "pwed": 0.9787234042553191
674
- },
675
- {
676
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
677
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
678
- "prediction": "p",
679
- "per": 1.0,
680
- "pwed": 0.9683908045977011
681
- },
682
- {
683
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
684
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
685
- "prediction": "p",
686
- "per": 0.9696969696969697,
687
- "pwed": 0.9696969696969697
688
- }
689
- ],
690
- "timestamp": "2024-12-20T15:26:27.658798"
691
- },
692
- {
693
- "task_id": "265c5859-e7ba-492d-a6c9-45733dc17c99",
694
- "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-nl",
695
- "subset": "timit-test",
696
- "num_files": 1680,
697
- "average_per": 0.9887075544197294,
698
- "average_pwed": 0.9692486915717254,
699
- "detailed_results": [
700
- {
701
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
702
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
703
- "prediction": "p",
704
- "per": 1.0,
705
- "pwed": 0.9747474747474747
706
- },
707
- {
708
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
709
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
710
- "prediction": "p",
711
- "per": 1.0,
712
- "pwed": 0.96875
713
- },
714
- {
715
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
716
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
717
- "prediction": "p",
718
- "per": 0.9787234042553191,
719
- "pwed": 0.9787234042553191
720
- },
721
- {
722
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
723
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
724
- "prediction": "p",
725
- "per": 1.0,
726
- "pwed": 0.9683908045977011
727
- },
728
- {
729
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
730
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
731
- "prediction": "p",
732
- "per": 0.9696969696969697,
733
- "pwed": 0.9696969696969697
734
- }
735
- ],
736
- "timestamp": "2024-12-20T15:40:51.631218"
737
- },
738
- {
739
- "task_id": "e297dfde-95e5-462b-a6e5-8fa43bc30bc0",
740
- "model": "speech31/wavlm-large-english-ipa",
741
- "subset": "timit-test",
742
- "num_files": 1680,
743
- "average_per": 0.3694017596969614,
744
- "average_pwed": 0.1356824900612308,
745
- "detailed_results": [
746
- {
747
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
748
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
749
- "prediction": "ʃihædjɔɹdɑɹksutɪnɡɹisiwɑʃwɔtɹ̩ɔljɪɹ",
750
- "per": 0.2727272727272727,
751
- "pwed": 0.11274509803921567
752
- },
753
- {
754
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
755
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
756
- "prediction": "dɑntæskmitəkæɹiænojliɹæɡlajkðæt",
757
- "per": 0.39285714285714285,
758
- "pwed": 0.13575268817204303
759
- },
760
- {
761
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
762
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
763
- "prediction": "hɪzkæpptənwɑzθɪændhæɡɹ̩dænhɪzbjutəfəlbutswɹ̩wɔɹnɪnʃæbi",
764
- "per": 0.3404255319148936,
765
- "pwed": 0.12980769230769232
766
- },
767
- {
768
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
769
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
770
- "prediction": "ðəɹizənzfɔɹðəsdajvsimdfulɪʃnaw",
771
- "per": 0.20689655172413793,
772
- "pwed": 0.051388888888888894
773
- },
774
- {
775
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
776
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
777
- "prediction": "pɹədʌkʃənmejffɔlfɔɑɹbɪlowɪkspɛktejʃənz",
778
- "per": 0.45454545454545453,
779
- "pwed": 0.16666666666666666
780
- }
781
- ],
782
- "timestamp": "2024-12-20T16:13:24.050232"
783
- },
784
- {
785
- "task_id": "efe95f71-05e3-485d-8e0c-1823a3037cf4",
786
- "model": "speech31/wavlm-large-english-ipa",
787
- "subset": "timit-test",
788
- "num_files": 1680,
789
- "average_per": 0.3694017596969614,
790
- "average_pwed": 0.1356824900612308,
791
- "detailed_results": [
792
- {
793
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
794
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
795
- "prediction": "ʃihædjɔɹdɑɹksutɪnɡɹisiwɑʃwɔtɹ̩ɔljɪɹ",
796
- "per": 0.2727272727272727,
797
- "pwed": 0.11274509803921567
798
- },
799
- {
800
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
801
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
802
- "prediction": "dɑntæskmitəkæɹiænojliɹæɡlajkðæt",
803
- "per": 0.39285714285714285,
804
- "pwed": 0.13575268817204303
805
- },
806
- {
807
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
808
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
809
- "prediction": "hɪzkæpptənwɑzθɪændhæɡɹ̩dænhɪzbjutəfəlbutswɹ̩wɔɹnɪnʃæbi",
810
- "per": 0.3404255319148936,
811
- "pwed": 0.12980769230769232
812
- },
813
- {
814
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
815
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
816
- "prediction": "ðəɹizənzfɔɹðəsdajvsimdfulɪʃnaw",
817
- "per": 0.20689655172413793,
818
- "pwed": 0.051388888888888894
819
- },
820
- {
821
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
822
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
823
- "prediction": "pɹədʌkʃənmejffɔlfɔɑɹbɪlowɪkspɛktejʃənz",
824
- "per": 0.45454545454545453,
825
- "pwed": 0.16666666666666666
826
- }
827
- ],
828
- "timestamp": "2024-12-20T16:26:47.980084"
829
- },
830
- {
831
- "task_id": "4b2ae2fc-fe2f-4f8b-9e8f-25c0bae13c0d",
832
- "model": "speech31/XLS-R-300m-english-ipa",
833
- "subset": "timit-test",
834
- "num_files": 1680,
835
- "average_per": 0.36382554692045954,
836
- "average_pwed": 0.1299702312124616,
837
- "detailed_results": [
838
- {
839
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
840
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
841
- "prediction": "ʃihædjɔɹdɑɹksutɪnɡɹisiwɑʃwɔtɹ̩ɔljɪɹ",
842
- "per": 0.2727272727272727,
843
- "pwed": 0.11274509803921567
844
- },
845
- {
846
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
847
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
848
- "prediction": "dɑntæskmitəkæɹiænojliɹæɡlajkðæt",
849
- "per": 0.39285714285714285,
850
- "pwed": 0.13575268817204303
851
- },
852
- {
853
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
854
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
855
- "prediction": "hɪzkæmptənwɑzθɪnændhæɡɹ̩dɪndhɪzbjutəfəlbutswɹ̩wɔɹnɪnʃæbi",
856
- "per": 0.3404255319148936,
857
- "pwed": 0.14583333333333334
858
- },
859
- {
860
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
861
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
862
- "prediction": "ðəɹɛzənzfɔɹðɪstajvsimdfulɪʃnaw",
863
- "per": 0.2413793103448276,
864
- "pwed": 0.052777777777777785
865
- },
866
- {
867
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
868
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
869
- "prediction": "pɹədʌkʃənmejfɔlfɑɹbɪlowɛkspɛktejʃənz",
870
- "per": 0.3939393939393939,
871
- "pwed": 0.11921296296296297
872
- }
873
- ],
874
- "timestamp": "2024-12-20T16:47:54.824174"
875
- },
876
- {
877
- "task_id": "33d387c0-703c-415d-b8e2-81cea87a2146",
878
- "model": "speech31/wav2vec2-large-english-TIMIT-phoneme_v3",
879
- "subset": "timit-test",
880
- "num_files": 1680,
881
- "average_per": 0.44563344149564776,
882
- "average_pwed": 0.18844914029048124,
883
- "detailed_results": [
884
- {
885
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
886
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
887
- "prediction": "ʃihædjʊrdɑrksutɪngrisiwɑʃwɔtərɔljɪrr",
888
- "per": 0.3939393939393939,
889
- "pwed": 0.12976190476190474
890
- },
891
- {
892
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
893
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
894
- "prediction": "doʊntæskmitɪkɛriənɔɪliræglaɪkðətdnt",
895
- "per": 0.39285714285714285,
896
- "pwed": 0.19730392156862747
897
- },
898
- {
899
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
900
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
901
- "prediction": "hɪzkæptənwɑzθɪnəndhægərdəndhɪzbjutəfəlbutswərwɔrnɪnʃæbibæb",
902
- "per": 0.44680851063829785,
903
- "pwed": 0.20394736842105265
904
- },
905
- {
906
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
907
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
908
- "prediction": "ðərizənzfərðɪsstaɪvsimdfulɪʃnaʊa",
909
- "per": 0.27586206896551724,
910
- "pwed": 0.11328125
911
- },
912
- {
913
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
914
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
915
- "prediction": "prədəkʃənmeɪfɔlfɑrbɪloʊɛkspɛkteɪʃənzd",
916
- "per": 0.3939393939393939,
917
- "pwed": 0.13626126126126126
918
- }
919
- ],
920
- "timestamp": "2024-12-20T17:05:35.210786"
921
- },
922
- {
923
- "task_id": "c89bcefc-3884-435a-a54c-24297fe6f041",
924
- "model": "speech31/wav2vec2-large-TIMIT-IPA2",
925
- "subset": "timit-test",
926
- "num_files": 1680,
927
- "average_per": 0.4847029843149011,
928
- "average_pwed": 0.2072006544586948,
929
- "detailed_results": [
930
- {
931
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
932
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
933
- "prediction": "ʃihædjʊrdɑrksutɪngrisiwɑʃwɔtərɔljɪrər",
934
- "per": 0.42424242424242425,
935
- "pwed": 0.15393518518518517
936
- },
937
- {
938
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
939
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
940
- "prediction": "doʊntæskmitɪkɛriənɔɪliræglaɪkðətdoʊndt",
941
- "per": 0.5,
942
- "pwed": 0.2623873873873874
943
- },
944
- {
945
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
946
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
947
- "prediction": "hɪzkæptənwɑzθɪnəndhægərdəndhɪzbjutəfəlbutswərwɔrnəndʃæbiiii",
948
- "per": 0.46808510638297873,
949
- "pwed": 0.2191091954022989
950
- },
951
- {
952
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
953
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
954
- "prediction": "ðərizənzfərðɪstaɪvsimdfulɪʃnaʊ",
955
- "per": 0.20689655172413793,
956
- "pwed": 0.054166666666666675
957
- },
958
- {
959
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
960
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
961
- "prediction": "prədəkʃənmeɪfɔlfɑrbɪloʊɛkspɛkteɪʃənzpzppppzpdtdtd",
962
- "per": 0.7272727272727273,
963
- "pwed": 0.34438775510204084
964
- }
965
- ],
966
- "timestamp": "2024-12-20T22:50:50.641790"
967
- },
968
- {
969
- "task_id": "81fa94f8-94ae-4601-952c-24abaddaf691",
970
- "model": "ginic/vary_individuals_young_only_3_wav2vec2-large-xlsr-buckeye-ipa",
971
- "subset": "timit-test",
972
- "num_files": 1680,
973
- "average_per": 0.2807914104790719,
974
- "average_pwed": 0.10494355278037441,
975
- "detailed_results": [
976
- {
977
- "file": "data/TEST/DR1/FAKS0/SA1.WAV",
978
- "ground_truth": "ʃihædjɹdɑɹksuɾɪŋgɹisiwɑʃwɑɾɹʔɔljiɹ",
979
- "prediction": "ʃihædjɹdɑɹksuɾɪnɡɹisiwɔʃwɔɾɹ̩ɔljiɹ",
980
- "per": 0.18181818181818182,
981
- "pwed": 0.0744949494949495
982
- },
983
- {
984
- "file": "data/TEST/DR1/FAKS0/SA2.WAV",
985
- "ground_truth": "oʊnæsmitikɛɹiinɔɪliɹæglaɪkðæt",
986
- "prediction": "doʊndæskmidɪkæɹiɪnɔɪliɹæɡlaɪkðæʔ",
987
- "per": 0.32142857142857145,
988
- "pwed": 0.140625
989
- },
990
- {
991
- "file": "data/TEST/DR1/FAKS0/SI1573.WAV",
992
- "ground_truth": "hɪzkæpinwəsθɪnænhægɹdinɪzbjuɾuflbutswɹwɔɹninʃæbi",
993
- "prediction": "hɪzkæptʌnwʌzθɪnɛnhæɡɹ̩dɛnɪzbjuɾʌfl̩butswɹ̩wɔɹnɪnʃæbi",
994
- "per": 0.2553191489361702,
995
- "pwed": 0.05357142857142856
996
- },
997
- {
998
- "file": "data/TEST/DR1/FAKS0/SI2203.WAV",
999
- "ground_truth": "ðiɹizənzfɹðɪsdaɪvsimdfuliʃnaʊ",
1000
- "prediction": "ðʌɹizʌn̩zfɹðʌstaɪvsimtfulɪʃnaʊ",
1001
- "per": 0.2413793103448276,
1002
- "pwed": 0.014367816091954023
1003
- },
1004
- {
1005
- "file": "data/TEST/DR1/FAKS0/SI943.WAV",
1006
- "ground_truth": "ɹdʌkʃinmeɪfɔlfɑɹbəloʊəkspikeɪʃnts",
1007
- "prediction": "pɹʌdʌkʃn̩meɪfɔlfɑɹbʌloʊɛkspɛkteɪʃʌns",
1008
- "per": 0.30303030303030304,
1009
- "pwed": 0.12023809523809523
1010
- }
1011
- ],
1012
- "timestamp": "2024-12-21T01:31:04.859070"
1013
- }
1014
- ]
app/queue/tasks.json DELETED
@@ -1,237 +0,0 @@
-[
-    {
-        "id": "721b4c64-a825-42d3-bb0a-bdff9ee1ed0f",
-        "model": "facebook/wav2vec2-lv-60-espeak-cv-ft",
-        "subset": "timit-test",
-        "submission_name": "facebook espeak",
-        "github_url": "https://github.com/facebookresearch/fairseq/blob/main/examples/wav2vec/README.md",
-        "status": "completed",
-        "submitted_at": "2024-12-05T07:19:03.076292"
-    },
-    {
-        "id": "d6fe0956-b5b4-4105-835e-8dee1872ee4d",
-        "model": "KoelLabs/xlsr-timit-b0",
-        "subset": "timit-test",
-        "submission_name": "english phoneme model",
-        "github_url": "https://github.com/KoelLabs/",
-        "status": "completed",
-        "submitted_at": "2024-12-05T08:12:40.161444"
-    },
-    {
-        "id": "dbf4642a-fb13-402c-8a74-cc41fc4be599",
-        "model": "speech31/wav2vec2-large-TIMIT-IPA",
-        "subset": "timit-test",
-        "submission_name": "speech 31 model",
-        "github_url": "https://huggingface.co/speech31/wav2vec2-large-TIMIT-IPA2",
-        "status": "completed",
-        "submitted_at": "2024-12-05T09:13:45.315361"
-    },
-    {
-        "id": "4e3b80be-b255-47f2-b4ae-18a12e232e8a",
-        "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.5",
-        "subset": "timit-test",
-        "submission_name": "Jubliano model",
-        "github_url": "https://huggingface.co/Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.5WithoutSpaces/tree/d5312009d8e620b183c334dfdd9ffc6b4f06f8c1",
-        "status": "processing",
-        "submitted_at": "2024-12-05T09:36:14.571930"
-    },
-    {
-        "id": "912449a4-d7ed-4af4-b5be-5c2c57ec09ff",
-        "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.5",
-        "subset": "timit-test",
-        "submission_name": "jubiliano model wav2vec2",
-        "github_url": "https://huggingface.co/Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.5WithoutSpaces/tree/d5312009d8e620b183c334dfdd9ffc6b4f06f8c1",
-        "status": "completed",
-        "submitted_at": "2024-12-05T10:01:40.502935"
-    },
-    {
-        "id": "c79df17e-2bb2-4253-ae26-f7cc6ab21265",
-        "model": "facebook/wav2vec2-xlsr-53-espeak-cv-ft",
-        "subset": "timit-test",
-        "submission_name": "xlsr 53 model",
-        "github_url": "https://github.com/facebookresearch/fairseq/blob/main/examples/wav2vec/README.md",
-        "status": "completed",
-        "submitted_at": "2024-12-05T10:18:37.408664"
-    },
-    {
-        "id": "f36060e6-a746-44dc-a527-54995b270053",
-        "model": "ginic/hyperparam_tuning_1_wav2vec2-large-xlsr-buckeye-ipa",
-        "subset": "timit-test",
-        "submission_name": "ginic model wav2vec2 finetuned on buckeye",
-        "github_url": "https://huggingface.co/ginic/vary_individuals_old_only_1_wav2vec2-large-xlsr-buckeye-ipa",
-        "status": "completed",
-        "submitted_at": "2024-12-05T10:36:02.340422"
-    },
-    {
-        "id": "abf6c247-9faf-46ef-b0fa-25f2669da922",
-        "model": "KoelLabs/xlsr-timit-a0",
-        "subset": "timit-test",
-        "submission_name": "Koel Labs early version of finetuned model ",
-        "github_url": "https://github.com/KoelLabs/ML",
-        "status": "processing",
-        "submitted_at": "2024-12-05T11:08:23.663553"
-    },
-    {
-        "id": "47d56349-8111-4bda-a47f-e007dbedd36d",
-        "model": "KoelLabs/xlsr-timit-a0",
-        "subset": "timit-test",
-        "submission_name": "koel labs initial ",
-        "github_url": "https://github.com/KoelLabs/ML/",
-        "status": "completed",
-        "submitted_at": "2024-12-12T15:28:12.923626"
-    },
-    {
-        "id": "51dd5735-63bd-4fe5-a588-c0fc079076e0",
-        "model": "KoelLabs/xlsr-timit-a0",
-        "subset": "timit-test",
-        "submission_name": "koel labs initial ",
-        "github_url": "https://github.com/KoelLabs/ML/",
-        "status": "completed",
-        "submitted_at": "2024-12-12T15:53:07.620070"
-    },
-    {
-        "id": "2e592612-ca38-4afb-a6a0-3c870b288960",
-        "model": "snu-nia-12/wav2vec2-large_nia12_phone-ipa_english",
-        "subset": "timit-test",
-        "submission_name": "wav2vec2 ipa eng ",
-        "github_url": "",
-        "status": "completed",
-        "submitted_at": "2024-12-18T21:41:21.861322"
-    },
-    {
-        "id": "ac4cbe86-4dbe-4929-8f76-4d2052e0acf1",
-        "model": "vitouphy/wav2vec2-xls-r-300m-timit-phoneme",
-        "subset": "timit-test",
-        "submission_name": "fine-tuned version of facebook/wav2vec2-xls-r-300m on the Timit dataset",
-        "github_url": "https://www.kaggle.com/code/vitouphy/phoneme-recognition-with-wav2vec2",
-        "status": "processing",
-        "submitted_at": "2024-12-18T22:09:03.412372"
-    },
-    {
-        "id": "d38e65ce-75b5-4dbf-8ade-bff6a5803790",
-        "model": "vitouphy/wav2vec2-xls-r-300m-timit-phoneme",
-        "subset": "timit-test",
-        "submission_name": "fine-tuned version of facebook/wav2vec2-xls-r-300m on the Timit dataset",
-        "github_url": "https://www.kaggle.com/code/vitouphy/phoneme-recognition-with-wav2vec2",
-        "status": "completed",
-        "submitted_at": "2024-12-18T22:19:46.817373"
-    },
-    {
-        "id": "2839c0c6-8f3b-426e-9eb7-04b6e133dc47",
-        "model": "ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa-plus-2000",
-        "subset": "timit-test",
-        "submission_name": "wav2vec2 model",
-        "github_url": "https://huggingface.co/ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa1000-ns",
-        "status": "completed",
-        "submitted_at": "2024-12-18T22:55:36.734691"
-    },
-    {
-        "id": "59afc37a-0072-44dd-a02a-0cf47d89c120",
-        "model": "ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa1000-ns",
-        "subset": "timit-test",
-        "submission_name": "wav2vec2 non-english transcription",
-        "github_url": "https://huggingface.co/ctaguchi/wav2vec2-large-xlsr-japlmthufielta-ipa1000-ns",
-        "status": "completed",
-        "submitted_at": "2024-12-18T23:47:03.488337"
-    },
-    {
-        "id": "e57eda9d-7a1d-4b41-9d47-a3d3839cac8b",
-        "model": "ginic/gender_split_70_female_4_wav2vec2-large-xlsr-buckeye-ipa",
-        "subset": "timit-test",
-        "submission_name": "phonetic transcription with the Buckeye corpus, from xlsr-53 model ",
-        "github_url": "https://github.com/ginic/multipa/tree/buckeye_experiments",
-        "status": "failed",
-        "submitted_at": "2024-12-19T11:48:26.415322",
-        "error": "Evaluation failed: (MaxRetryError(\"HTTPSConnectionPool(host='cdn-lfs-us-1.hf.co', port=443): Max retries exceeded with url: /repos/a4/b1/a4b11f4627350048e021a84d10b89320db54e02c54b2a9366228f8a05cda220b/120f5bc04d1df15143033c93e3ef358981775b529f17e0db11e58a1b80754e67?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27model.safetensors%3B+filename%3D%22model.safetensors%22%3B&Expires=1734889736&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTczNDg4OTczNn19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zL2E0L2IxL2E0YjExZjQ2MjczNTAwNDhlMDIxYTg0ZDEwYjg5MzIwZGI1NGUwMmM1NGIyYTkzNjYyMjhmOGEwNWNkYTIyMGIvMTIwZjViYzA0ZDFkZjE1MTQzMDMzYzkzZTNlZjM1ODk4MTc3NWI1MjlmMTdlMGRiMTFlNThhMWI4MDc1NGU2Nz9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSoifV19&Signature=kfPD6ymEJuVvFZyuN3qL3xk4YJlpI5dqHgON4wJY-Mppwlp6x4Dw7cWdjEkJvMRF-bDuzNWQ3BEJPbsYouVW9WZMucDmxo38UwxSzIBhfWQxCYiHdUWuQPkypDUkI1mR3vbnCFQFXLiMQ2CgwWQz7q66OjIyq3suA00mhL2WcL8wvtovrfoEOkboEXCHCNLprfpoHpfoyfo~VS9~kmm61GN6SWbc9lzASIuT5FLkn~BJ6h405MgutQpNvrR4SHVLftk7rBmY8TAB3re5D0-9qFrMYb2Tk~9RKT3nxSNbgZVcEXzA5rYskcuGsrHoTuTTZ-NSW69K2M0IeivzFWTLNQ__&Key-Pair-Id=K24J24Z295AEI9 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x280544190>: Failed to establish a new connection: [Errno 51] Network is unreachable'))\"), '(Request ID: 14c9cc7c-47ee-47ae-b473-f4add807d233)')"
-    },
-    {
-        "id": "5517f6b2-6a76-4a2d-a6ce-33446f390c3b",
-        "model": "ginic/gender_split_70_female_4_wav2vec2-large-xlsr-buckeye-ipa",
-        "subset": "timit-test",
-        "submission_name": "phonetic transcription with the Buckeye corpus, from xlsr-53 model",
-        "github_url": "https://github.com/ginic/multipa/tree/buckeye_experiments",
-        "status": "completed",
-        "submitted_at": "2024-12-20T13:29:37.327317"
-    },
-    {
-        "id": "c2139f96-e79e-4f25-a525-aa039f65555f",
-        "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.9.2WithoutSpaces",
-        "subset": "timit-test",
-        "submission_name": "phonetic transcription",
-        "github_url": "https://huggingface.co/Jubliano/wav2vec2-large-xls-r-300m-ipa-INTERNATIONAL1.5WithoutSpaces",
-        "status": "completed",
-        "submitted_at": "2024-12-20T14:01:35.626112"
-    },
-    {
-        "id": "d146f1f1-6e6e-4b28-9420-c652ae9a1002",
-        "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-nl",
-        "subset": "timit-test",
-        "submission_name": "Jubliano xlsr model",
-        "github_url": "https://huggingface.co/Jubliano/wav2vec2-large-xls-r-300m-ipa-nl1.1",
-        "status": "completed",
-        "submitted_at": "2024-12-20T15:08:45.949389"
-    },
-    {
-        "id": "265c5859-e7ba-492d-a6c9-45733dc17c99",
-        "model": "Jubliano/wav2vec2-large-xls-r-300m-ipa-nl",
-        "subset": "timit-test",
-        "submission_name": "Jubliano xlsr model",
-        "github_url": "https://huggingface.co/Jubliano/wav2vec2-large-xls-r-300m-ipa-nl1.1",
-        "status": "completed",
-        "submitted_at": "2024-12-20T15:26:27.706187"
-    },
-    {
-        "id": "e297dfde-95e5-462b-a6e5-8fa43bc30bc0",
-        "model": "speech31/wavlm-large-english-ipa",
-        "subset": "timit-test",
-        "submission_name": "speech31 phoneme transcription english",
-        "github_url": "https://huggingface.co/speech31/wavlm-large-english-ipa",
-        "status": "completed",
-        "submitted_at": "2024-12-20T15:56:25.445806"
-    },
-    {
-        "id": "efe95f71-05e3-485d-8e0c-1823a3037cf4",
-        "model": "speech31/wavlm-large-english-ipa",
-        "subset": "timit-test",
-        "submission_name": "speech31 phoneme transcription english",
-        "github_url": "https://huggingface.co/speech31/wavlm-large-english-ipa",
-        "status": "completed",
-        "submitted_at": "2024-12-20T16:13:24.099308"
-    },
-    {
-        "id": "4b2ae2fc-fe2f-4f8b-9e8f-25c0bae13c0d",
-        "model": "speech31/XLS-R-300m-english-ipa",
-        "subset": "timit-test",
-        "submission_name": "speech31 xlsr model",
-        "github_url": "https://huggingface.co/speech31/XLS-R-300m-english-ipa",
-        "status": "completed",
-        "submitted_at": "2024-12-20T16:33:23.864360"
-    },
-    {
-        "id": "33d387c0-703c-415d-b8e2-81cea87a2146",
-        "model": "speech31/wav2vec2-large-english-TIMIT-phoneme_v3",
-        "subset": "timit-test",
-        "submission_name": "model is a fine-tuned version of facebook/wav2vec2-large on the TIMIT dataset",
-        "github_url": "https://huggingface.co/speech31/wav2vec2-large-english-TIMIT-phoneme_v3",
-        "status": "completed",
-        "submitted_at": "2024-12-20T16:52:07.883839"
-    },
-    {
-        "id": "c89bcefc-3884-435a-a54c-24297fe6f041",
-        "model": "speech31/wav2vec2-large-TIMIT-IPA2",
-        "subset": "timit-test",
-        "submission_name": "fine-tuned version of facebook/wav2vec2-large on the None dataset",
-        "github_url": "https://huggingface.co/speech31/wav2vec2-large-TIMIT-IPA2",
-        "status": "completed",
-        "submitted_at": "2024-12-20T21:54:38.559569"
-    },
-    {
-        "id": "81fa94f8-94ae-4601-952c-24abaddaf691",
-        "model": "ginic/vary_individuals_young_only_3_wav2vec2-large-xlsr-buckeye-ipa",
-        "subset": "timit-test",
-        "submission_name": "ginic model, facebook/wav2vec2-large-xlsr-53 fine tuned",
-        "github_url": "https://huggingface.co/ginic/vary_individuals_young_only_3_wav2vec2-large-xlsr-buckeye-ipa",
-        "status": "completed",
-        "submitted_at": "2024-12-21T01:15:41.870875"
-    }
-]
app/tasks.py CHANGED
@@ -1,224 +1,117 @@
-# This modules handles the task queue, results, and leaderboard storage.
-
-import json
-import uuid
 from datetime import datetime
-from pathlib import Path
-from typing import Optional
-
-import asyncio
-import pandas as pd
-
-from inference import evaluate_model
-
-
-# Get absolute path
-CURRENT_DIR = Path(__file__).parent.absolute()
-
-# Constants
-QUEUE_DIR = CURRENT_DIR / "queue"
-PATHS = {
-    "tasks": QUEUE_DIR / "tasks.json",
-    "results": QUEUE_DIR / "results.json",
-    "leaderboard": QUEUE_DIR / "leaderboard.json",
-}
-
-
-# Handle storing and loading data from JSON files
-class StorageManager:
-    """Handles all JSON storage operations"""
-
-    def __init__(self, paths: dict[str, Path]):
-        self.paths = paths
-        self._ensure_directories()
-
-    def _ensure_directories(self):
-        """Ensure all necessary directories and files exist"""
-        for path in self.paths.values():
-            path.parent.mkdir(parents=True, exist_ok=True)
-            if not path.exists():
-                path.write_text("[]")
-
-    def load(self, key: str) -> list:
-        """Load JSON file"""
-        return json.loads(self.paths[key].read_text())
-
-    def save(self, key: str, data: list):
-        """Save data to JSON file"""
-        self.paths[key].write_text(
-            json.dumps(data, indent=4, default=str, ensure_ascii=False)
-        )
-
-    def update_task(self, task_id: str, updates: dict):
-        """Update specific task with new data"""
-        tasks = self.load("tasks")
-        for task in tasks:
-            if task["id"] == task_id:
-                task.update(updates)
-                break
-        self.save("tasks", tasks)
-
-
-# Initialize storage manager
-storage_manager = StorageManager(PATHS)
-
-
-# Export external functions
-def get_leaderboard_data():
-    """Return leaderboard data as DataFrame"""
-    try:
-        return pd.DataFrame(storage_manager.load("leaderboard"))
-    except Exception as e:
-        print(f"Error loading leaderboard: {e}")
-        return pd.DataFrame()
-
-
-def get_results():
-    """Return list of evaluation results"""
-    return storage_manager.load("results")
-
-
-def get_tasks():
-    """Return list of tasks"""
-    return storage_manager.load("tasks")
-
-
-def get_status(query: str) -> dict:
-    """Check status of a model evaluation task_id or model_name"""
-    if not query:
-        return {"error": "Please enter a model name or task ID"}
-
-    try:
-        results = get_results()
-        tasks = get_tasks()
-
-        # First try to find by task ID
-        result = next((r for r in results if r["task_id"] == query), None)
-        task = next((t for t in tasks if t["id"] == query), None)
-
-        # If not found, try to find by model name
-        if not result:
-            result = next((r for r in results if r["model"] == query), None)
-        if not task:
-            task = next((t for t in tasks if t["model"] == query), None)
-
-        if result:
-            # If we found results, return them
-            return {
-                "status": "completed",
-                "model": result["model"],
-                "subset": result["subset"],
-                "num_files": result["num_files"],
-                "average_per": result["average_per"],
-                "average_pwed": result["average_pwed"],
-                "detailed_results": result["detailed_results"],
-                "timestamp": result["timestamp"],
-            }
-        elif task:
-            # If we only found task status, return that
-            return task
-        else:
-            return {"error": f"No results found for '{query}'"}
-
-    except Exception as e:
-        print(f"Error checking status: {e}")
-        return {"error": f"Error checking status: {str(e)}"}
-
-
-def start_eval_task(
-    model_name: str, submission_name: str, github_url: Optional[str] = None
-) -> str:
-    """Start evaluation task in background. Returns task ID that can be used to check status."""
-
-    # Generate a task ID
-    task_id = str(uuid.uuid4())
-
-    # Create task entry
-    task = {
-        "id": task_id,
-        "model": model_name,
-        "subset": "test",
-        "submission_name": submission_name,
-        "github_url": github_url,
-        "status": "queued",
-        "submitted_at": datetime.now().isoformat(),
-    }
-
-    # Save task
-    tasks = storage_manager.load("tasks")
-    tasks.append(task)
-    storage_manager.save("tasks", tasks)
-
-    # Start evaluation in background
-    asyncio.run(_eval_task(task_id, model_name, submission_name, "test", github_url))
-
-    return task_id
-
-
-async def _eval_task(
-    task_id: str,
-    model_name: str,
-    submission_name: str,
-    subset: str = "test",
-    github_url: Optional[str] = None,
-    max_samples: Optional[int] = None,
-):
     """Background task to evaluate model and save updated results"""
     try:
         # Indicate task is processing
-        storage_manager.update_task(task_id, {"status": "processing"})
 
         # Evaluate model
-        result = evaluate_model(model_name, subset, max_samples)
-        avg_per = result["average_per"]
-        avg_pwed = result["average_pwed"]
 
         # Save results
-        print("Saving results...")
-        current_results = storage_manager.load("results")
-        current_results.append(result)
-        storage_manager.save("results", current_results)
-
-        # Update leaderboard
-        print("Updating leaderboard...")
-        leaderboard = storage_manager.load("leaderboard")
-        entry = next(
-            (e for e in leaderboard if e["submission_name"] == submission_name),
-            None,
-        )
-
-        if entry:
-            # Simply update with new scores
-            entry.update(
-                {
-                    "task_id": task_id,
-                    "average_per": avg_per,
-                    "average_pwed": avg_pwed,
-                    "model": model_name,
-                    "subset": subset,
-                    "github_url": github_url,
-                    "submission_date": datetime.now().isoformat(),
-                }
             )
-        else:
-            leaderboard.append(
-                {
-                    "task_id": task_id,
-                    "submission_id": str(uuid.uuid4()),
-                    "submission_name": submission_name,
-                    "model": model_name,
-                    "average_per": avg_per,
-                    "average_pwed": avg_pwed,
-                    "subset": subset,
-                    "github_url": github_url,
-                    "submission_date": datetime.now().isoformat(),
-                }
-            )
-
-        storage_manager.save("leaderboard", leaderboard)
-        storage_manager.update_task(task_id, {"status": "completed"})
-        print("Evaluation completed successfully")
 
     except Exception as e:
-        error_msg = f"Evaluation failed: {str(e)}"
-        print(error_msg)
-        storage_manager.update_task(task_id, {"status": "failed", "error": error_msg})

+# This module handles the task queue
+
+import os
+import multiprocessing
+from typing import TypedDict
 from datetime import datetime
+
+from metrics import per, fer
+from datasets import load_from_disk
+from hf import get_repo_info, add_leaderboard_entry
+from inference import clear_cache, load_model, transcribe
+
+leaderboard_lock = multiprocessing.Lock()
+
+
+class Task(TypedDict):
+    status: str
+    display_name: str
+    repo_id: str
+    repo_hash: str
+    repo_last_modified: datetime
+    submission_timestamp: datetime
+    url: str
+    error: str | None
+
+
+tasks: list[Task] = []
+
+
+def get_status(query: str) -> dict:
+    """Check status of an evaluation task by repo_id or repo_hash"""
+
+    query = query.strip().lower()
+    if not query:
+        return {"error": "Please enter a model id or task id"}
+
+    for task in reversed(tasks):
+        if task["repo_id"].lower() == query or task["repo_hash"].lower() == query:
+            return dict(task)
+
+    return {"error": f"No results found for '{query}'"}
+
+
+def start_eval_task(display_name: str, repo_id: str, url: str) -> str:
+    """Start evaluation task in background. Returns task ID that can be used to check status."""
+
+    repo_hash, last_modified = get_repo_info(repo_id)
+    # TODO: check if hash is different from the most recent submission if any for repo_id, otherwise don't recompute
+    task = Task(
+        status="submitted",
+        display_name=display_name,
+        repo_id=repo_id,
+        repo_hash=repo_hash,
+        repo_last_modified=last_modified,
+        submission_timestamp=datetime.now(),
+        url=url,
+        error=None,
+    )
+
+    manager = multiprocessing.Manager()
+    task_proxy = manager.dict(task)
+    tasks.append(task_proxy)  # type: ignore
+    multiprocessing.Process(
+        target=_eval_task, args=[task_proxy, leaderboard_lock]
+    ).start()
+
+    return repo_hash
+
+
+test_ds = load_from_disk(os.path.join(os.path.dirname(__file__), "data", "test"))
+
+
+def _eval_task(task: Task, leaderboard_lock):
     """Background task to evaluate model and save updated results"""
     try:
         # Indicate task is processing
+        task["status"] = "evaluating"
 
         # Evaluate model
+        average_per = 0
+        average_fer = 0
+        per_dataset_fers = {}
+
+        clear_cache()
+        model, processor = load_model(task["repo_id"])
+        for row in test_ds:
+            transcript = transcribe(row["audio"]["array"], model, processor)  # type: ignore
+            row_per = per(transcript, row["ipa"])  # type: ignore
+            row_fer = fer(transcript, row["ipa"])  # type: ignore
+            average_per += row_per
+            average_fer += row_fer
+            per_dataset_fers[row["dataset"]] = per_dataset_fers.get(row["dataset"], 0) + row_fer  # type: ignore
+        for key in per_dataset_fers.keys():
+            per_dataset_fers[key] /= len(test_ds.filter(lambda r: r["dataset"] == key))
+        average_per /= len(test_ds)
+        average_fer /= len(test_ds)
 
         # Save results
+        with leaderboard_lock:
+            add_leaderboard_entry(
+                display_name=task["display_name"],
+                repo_id=task["repo_id"],
+                repo_hash=task["repo_hash"],
+                repo_last_modified=task["repo_last_modified"],
+                submission_timestamp=task["submission_timestamp"],
+                average_per=average_per,
+                average_fer=average_fer,
+                url=task["url"],
+                per_dataset_fers=per_dataset_fers,
             )
+
+        # Mark task as complete
+        task["status"] = "completed"
     except Exception as e:
+        task["status"] = "failed"
+        task["error"] = str(e)
requirements.txt CHANGED
@@ -1,11 +1,17 @@
-# Core ML dependencies
-torch==2.0.1
-torchaudio==2.0.2
-transformers==4.44.2
-huggingface_hub==0.25.1
-gradio==5.12.0
-panphon==0.21.2
+# Huggingface
+huggingface_hub==0.34.4
+datasets==4.0.0
 
 # Data processing
 pandas==2.0.3
 numpy==1.25.2
+panphon==0.21.2
+torch==2.8.0
+torchaudio==2.8.0
+torchcodec==0.6.0
+transformers==4.56.0
+phonemizer==3.3.0
+
+# UI
+gradio==5.12.0
+protobuf==6.32.0
requirements_lock.txt CHANGED
@@ -1,28 +1,100 @@
-certifi==2024.12.14
-cfgv==3.4.0
-charset-normalizer==3.4.1
-distlib==0.3.8
-filelock==3.15.4
-fsspec==2024.12.0
-huggingface-hub==0.27.1
-identify==2.5.36
 idna==3.10
-ml_dtypes==0.5.0
-nodeenv==1.9.1
-numpy==2.1.3
-onnx==1.17.0
-onnxscript==0.1.0.dev20241223
-packaging==24.2
-platformdirs==4.2.2
-pre-commit==3.7.1
-protobuf==5.29.2
-PyYAML==6.0.1
-regex==2024.11.6
-requests==2.32.3
-safetensors==0.5.2
-tokenizers==0.21.0
 tqdm==4.67.1
-transformers==4.48.0
-typing_extensions==4.12.2
-urllib3==2.3.0
-virtualenv==20.26.3

+aiofiles==23.2.1
+aiohappyeyeballs==2.6.1
+aiohttp==3.12.15
+aiosignal==1.4.0
+annotated-types==0.7.0
+anyio==4.10.0
+async-timeout==5.0.1
+attrs==25.3.0
+babel==2.17.0
+certifi==2025.8.3
+charset-normalizer==3.4.3
+click==8.2.1
+colorama==0.4.6
+csvw==3.5.1
+datasets==4.0.0
+dill==0.3.8
+dlinfo==2.0.0
+editdistance==0.8.1
+exceptiongroup==1.3.0
+fastapi==0.116.1
+ffmpy==0.6.1
+filelock==3.19.1
+frozenlist==1.7.0
+fsspec==2025.3.0
+gradio==5.12.0
+gradio_client==1.5.4
+h11==0.16.0
+hf-xet==1.1.9
+httpcore==1.0.9
+httpx==0.28.1
+huggingface-hub==0.34.4
 idna==3.10
+isodate==0.7.2
+Jinja2==3.1.6
+joblib==1.5.2
+jsonschema==4.25.1
+jsonschema-specifications==2025.4.1
+language-tags==1.2.0
+markdown-it-py==4.0.0
+MarkupSafe==2.1.5
+mdurl==0.1.2
+mpmath==1.3.0
+multidict==6.6.4
+multiprocess==0.70.16
+munkres==1.1.4
+networkx==3.4.2
+numpy==1.25.2
+orjson==3.11.3
+packaging==25.0
+pandas==2.0.3
+panphon==0.21.2
+phonemizer==3.3.0
+pillow==11.3.0
+propcache==0.3.2
+protobuf==6.32.0
+pyarrow==21.0.0
+pydantic==2.11.7
+pydantic_core==2.33.2
+pydub==0.25.1
+Pygments==2.19.2
+pyparsing==3.2.3
+python-dateutil==2.9.0.post0
+python-multipart==0.0.20
+pytz==2025.2
+PyYAML==6.0.2
+rdflib==7.1.4
+referencing==0.36.2
+regex==2025.9.1
+requests==2.32.5
+rfc3986==1.5.0
+rich==14.1.0
+rpds-py==0.27.1
+ruff==0.12.11
+safehttpx==0.1.6
+safetensors==0.6.2
+segments==2.3.0
+semantic-version==2.10.0
+shellingham==1.5.4
+six==1.17.0
+sniffio==1.3.1
+starlette==0.47.3
+sympy==1.14.0
+tokenizers==0.22.0
+tomlkit==0.13.3
+torch==2.8.0
+torchaudio==2.8.0
+torchcodec==0.6.0
 tqdm==4.67.1
+transformers==4.56.0
+typer==0.17.3
+typing-inspection==0.4.1
+typing_extensions==4.15.0
+tzdata==2025.2
+unicodecsv==0.14.1
+uritemplate==4.2.0
+urllib3==2.5.0
+uvicorn==0.35.0
+websockets==14.2
+xxhash==3.5.0
+yarl==1.20.1
scripts/download_data_curl.sh DELETED
@@ -1,3 +0,0 @@
-# install ./.data/TIMIT.zip from https://www.kaggle.com/datasets/mfekadu/darpa-timit-acousticphonetic-continuous-speech?resource=download
-curl -L -o ./queue/data/TIMIT.zip \
-    https://www.kaggle.com/api/v1/datasets/download/mfekadu/darpa-timit-acousticphonetic-continuous-speech
scripts/download_data_lfs.sh DELETED
@@ -1,2 +0,0 @@
-# Download the TIMIT.zip dataset
-git lfs pull --include="./queue/data/TIMIT.zip"
scripts/install.sh DELETED
@@ -1,19 +0,0 @@
-# Create a virtual environment with Python 3.10
-python3.10 -m venv venv
-
-# Activate the virtual environment
-. ./venv/bin/activate
-
-# Install the required dependencies
-pip install -r requirements_lock.txt
-
-# Download data
-# check if git lfs is installed and run the appropriate script, otherwise run the curl script
-if [ -x "$(command -v git-lfs)" ]; then
-    . ./scripts/download_data_lfs.sh
-else
-    . ./scripts/download_data_curl.sh
-fi
-
-# Deactivate the virtual environment
-deactivate
scripts/run-dev.sh CHANGED
@@ -1,8 +1,2 @@
-# Activate the virtual environment
-. ./venv/bin/activate
-
 # Run the app with auto-reload enabled
 gradio app/app.py
-
-# Deactivate the virtual environment
-deactivate
scripts/run-prod.sh CHANGED
@@ -1,8 +1,2 @@
-# Activate the virtual environment
-. ./venv/bin/activate
-
 # Run the app without auto-reload
 python app/app.py
-
-# Deactivate the virtual environment
-deactivate
scripts/sample_test_set.py ADDED
@@ -0,0 +1,33 @@
+#!/usr/bin/env python3
+
+import os
+from datasets import load_dataset, concatenate_datasets, Dataset
+
+SEED = 42
+SAMPLE_SIZE = 100
+
+testsets: list[tuple[str, Dataset]] = [
+    ("TIMIT", load_dataset("KoelLabs/TIMIT")["test"]),
+    ("EpaDB", load_dataset("KoelLabs/EpaDB")["test"]),
+    ("PSST", load_dataset("KoelLabs/PSST")["test"]),
+    ("SpeechOcean", load_dataset("KoelLabs/SpeechOceanNoTH")["test"]),
+    ("ISLE", load_dataset("KoelLabs/ISLE")["train"]),
+]  # type: ignore
+
+all_datasets = []
+for name, test_ds in testsets:
+    shuffled_ds = test_ds.shuffle(seed=SEED)
+    sample_ds = shuffled_ds.select(range(SAMPLE_SIZE))
+    sample_ds = sample_ds.add_column("dataset", [name] * len(sample_ds))  # type: ignore
+    sample_ds = sample_ds.remove_columns(
+        [
+            col
+            for col in sample_ds.column_names
+            if col not in ["audio", "ipa", "dataset"]
+        ]
+    )
+    all_datasets.append(sample_ds)
+combined_ds: Dataset = concatenate_datasets(all_datasets)
+
+os.makedirs(os.path.join("app", "data"), exist_ok=True)
+combined_ds.save_to_disk(os.path.join("app", "data", "test"))
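The shuffle-then-select pattern in the sampling script is what makes the test set reproducible: the same seed always yields the same 100 rows per corpus. The same idea in plain Python, without the `datasets` dependency (the `sample_rows` helper is illustrative, not part of the script):

```python
import random

def sample_rows(rows, k, seed=42):
    """Deterministically sample k rows, mirroring shuffle(seed=...) then select(range(k))."""
    rng = random.Random(seed)  # dedicated RNG so global random state is untouched
    shuffled = list(rows)
    rng.shuffle(shuffled)  # seeded in-place shuffle, so the result is reproducible
    return shuffled[:k]

rows = list(range(10))
assert sample_rows(rows, 3) == sample_rows(rows, 3)  # same seed -> same sample
```

Pinning the seed this way means the leaderboard's sampled test split can be regenerated byte-for-byte as long as the upstream dataset revisions stay fixed.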