# Development
## Design Decisions
We deliberately run the leaderboard in a single Space for simplicity. To keep the Gradio UI responsive while models are being evaluated, evaluation runs in background tasks rather than in a separate Space.
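As a rough illustration of the background-task idea, here is a minimal sketch that uses only the Python standard library. It is not the code in `app/tasks.py`; the names `task_queue`, `results`, `evaluate_model`, `worker`, and `submit` are hypothetical and exist only for this example. A UI handler enqueues a model and returns immediately, while a daemon thread does the slow work.

```python
import queue
import threading

# Hypothetical names throughout; the real logic lives in app/tasks.py.
task_queue: "queue.Queue[str]" = queue.Queue()
results: dict[str, str] = {}
results_lock = threading.Lock()


def evaluate_model(model_id: str) -> str:
    """Stand-in for a slow evaluation; returns a placeholder status."""
    return f"evaluated {model_id}"


def worker() -> None:
    """Pull model ids off the queue and evaluate them one at a time."""
    while True:
        model_id = task_queue.get()
        try:
            outcome = evaluate_model(model_id)
        except Exception as exc:  # keep the worker alive if one task fails
            outcome = f"failed: {exc}"
        with results_lock:
            results[model_id] = outcome
        task_queue.task_done()


def submit(model_id: str) -> str:
    """Called from a UI handler: enqueue the model and return immediately."""
    task_queue.put(model_id)
    return f"queued {model_id}"


# Daemon thread, so it never blocks interpreter shutdown.
threading.Thread(target=worker, daemon=True).start()
```

Because `submit` only puts an item on the queue, the handler that calls it returns right away, and the UI can poll `results` to show progress.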
## Setup
### Prerequisites
* Python 3.10
* Git
* A love for speech recognition! 🎀
### Quick Installation
1. Clone this repository:
```bash
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/spaces/KoelLabs/IPA-Transcription-EN
cd IPA-Transcription-EN
```
2. Set up your environment and download data:
```bash
. ./scripts/install.sh
```
3. Launch the leaderboard in development mode (auto-reloads on code changes):
```bash
. ./scripts/run-dev.sh
```
4. Visit `http://localhost:7860` in your browser and see the magic! ✨
## Adding/Removing Dependencies
0. Activate the virtual environment with `. ./venv/bin/activate`
1. Add the dependency to `requirements.txt` (or remove it)
2. Make sure you have no unused dependencies with `pipx run deptry .`
3. Run `pip install -r requirements.txt`
4. Freeze the dependencies with `pip freeze > requirements_lock.txt`
## Run without reloading
```bash
. ./scripts/run-prod.sh
```
## File Structure
The two most important files are `app/app.py` for the main Gradio UI and `app/tasks.py` for the background tasks that evaluate models.
```
IPA-Transcription-EN/
├── README.md                  # General information about the leaderboard
├── CONTRIBUTING.md            # Contribution guidelines
├── DEVELOPMENT.md             # Development setup and design decisions
├── requirements.txt           # Python dependencies
├── requirements_lock.txt      # Locked dependencies
├── scripts                    # Helper scripts
│   ├── install.sh             # Install dependencies and download data
│   └── run-dev.sh             # Run the leaderboard in development mode
├── venv                       # Virtual environment
├── app/                       # All application code lives here
│   ├── data/                  # Phoneme transcription datasets
│   ├── queue/                 # Stores leaderboard state and task status
│   │   ├── tasks.json         # Task queue
│   │   ├── results.json       # Detailed evaluation results
│   │   └── leaderboard.json   # Compact results for leaderboard display
│   ├── app.py                 # Main Gradio UI
│   ├── tasks.py               # Background tasks for model evaluation
│   ├── data.py                # Data loading and processing
│   ├── inference.py           # Model inference
│   └── phone_metrics.py       # Evaluation metrics
└── img/                       # Images for README and other documentation
```
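Since `app/queue/` keeps its state in JSON files, one way to read and update such a file safely while a background task may also be writing is sketched below. This is an assumption-laden example, not the project's implementation: the helpers `load_state` and `save_state` are hypothetical, and the actual schema of `tasks.json`, `results.json`, and `leaderboard.json` is not specified here.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path: Path, default):
    """Return parsed JSON from path, or default if the file does not exist."""
    if not path.exists():
        return default
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def save_state(path: Path, state) -> None:
    """Write JSON to a temp file, then atomically swap it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f, indent=2)
        # os.replace is atomic when source and target share a filesystem,
        # so a reader never sees a half-written file.
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)  # clean up the temp file if anything went wrong
        raise
```

Writing to a temporary file in the same directory and then replacing the target means the UI either reads the old state or the new one, never a partial write.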