---
title: Speech Segmenter (STT)
emoji: 🏃
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: false
short_description: Advanced audio transcription with alignment & diarization
---

This Space provides an advanced **Speech-to-Text (STT)** pipeline enhanced with alignment and speaker diarization:

- **STT (Speech-to-Text):** Converts spoken audio into written text (transcription).
- **Alignment:** Aligns each word with its timestamps in the audio (word-level timing).
- **Speaker Diarization:** Detects and labels who spoke when, supplying the speaker for each part of the transcript.
- **Post-processing:** Combines the transcription, word timings, and speaker labels into a richer, structured transcript.

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
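
As an illustration of the post-processing step described above, the following is a minimal sketch of how word-level timestamps and diarization turns might be merged into speaker-labeled segments. It is not this Space's actual implementation: the `Word` and `Turn` types, the `assign_speakers` and `build_transcript` functions, the `UNKNOWN` label, and the "largest time overlap wins" rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One aligned word (hypothetical shape of the alignment output)."""
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Turn:
    """One diarization turn (hypothetical shape of the diarization output)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns):
    """Pair each word with the speaker whose turn overlaps it the most."""
    labeled = []
    for word in words:
        best = max(
            turns,
            key=lambda t: overlap(word.start, word.end, t.start, t.end),
            default=None,
        )
        # Assumption: words that overlap no turn get a placeholder label.
        if best is None or overlap(word.start, word.end, best.start, best.end) == 0.0:
            labeled.append(("UNKNOWN", word))
        else:
            labeled.append((best.speaker, word))
    return labeled


def build_transcript(words, turns):
    """Group consecutive words from the same speaker into one segment."""
    segments = []
    for speaker, word in assign_speakers(words, turns):
        if segments and segments[-1]["speaker"] == speaker:
            # Same speaker keeps talking: extend the current segment.
            segments[-1]["text"] += " " + word.text
            segments[-1]["end"] = word.end
        else:
            # Speaker changed (or first word): start a new segment.
            segments.append(
                {"speaker": speaker, "start": word.start, "end": word.end, "text": word.text}
            )
    return segments


if __name__ == "__main__":
    demo_words = [Word("hello", 0.0, 0.4), Word("there", 0.5, 0.9), Word("hi", 1.1, 1.3)]
    demo_turns = [Turn("SPEAKER_00", 0.0, 1.0), Turn("SPEAKER_01", 1.0, 2.0)]
    for segment in build_transcript(demo_words, demo_turns):
        print(segment)
```

Assigning by largest overlap is only one possible rule; a real pipeline may instead use word midpoints, smooth over very short turns, or handle overlapping speech differently.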