	---
	title: Fillersark
	emoji: 📉
	colorFrom: pink
	colorTo: blue
	sdk: gradio
	sdk_version: 5.28.0
	app_file: app.py
	pinned: false
	license: other
	short_description: filler
	---

	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

	# 🎙️ CrisperWhisper Speech-to-Text

	This Hugging Face Space provides a speech-to-text transcription service powered by the [nyrahealth/CrisperWhisper](https://huggingface.co/nyrahealth/CrisperWhisper) model. Upload audio files and get transcribed text with word-level timestamps.

	## Features

	- Transcribe audio files to text with word-level timestamps
	- Support for multiple audio formats (MP3, WAV, M4A, OGG, FLAC)
	- Supports files up to 30 MB
	- Simple web interface using Gradio
	- REST API endpoint for programmatic access
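
The format and size limits above can be checked locally before uploading. The following is a minimal sketch, not part of the Space's code: `check_audio_file` is a hypothetical helper, and treating "30 MB" as 30 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

# Formats and size limit as listed in the Features section.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac"}
MAX_BYTES = 30 * 1024 * 1024  # assumption: "30 MB" is read as 30 MiB


def check_audio_file(path):
    """Return a list of problems; an empty list means the file looks uploadable."""
    p = Path(path)
    problems = []
    # Compare the extension case-insensitively, so "clip.MP3" is accepted.
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if not p.is_file():
        problems.append("file not found")
    elif p.stat().st_size > MAX_BYTES:
        problems.append(f"file is larger than {MAX_BYTES} bytes")
    return problems
```

Calling `check_audio_file("/path/to/your-audio-file.mp3")` returns an empty list when the file exists, has a listed extension, and fits within the limit.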

	## How to Use

	1. Upload an audio file using the interface
	2. Click "Transcribe"
	3. View both the plain-text transcription and the detailed JSON output with word-level timestamps
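
The JSON output from step 3 can be post-processed into a readable word list. The sketch below assumes the JSON follows the shape returned by the Transformers speech-recognition pipeline with word timestamps (a `text` string plus a `chunks` list of `{"text", "timestamp"}` entries). This README does not document the exact schema, so treat the field names as assumptions. `format_words` and the sample data are hypothetical.

```python
def format_words(result):
    """Turn a transcription result into 'start-end  word' lines."""

    def fmt(seconds):
        # A timestamp may be missing (None); show a placeholder instead of failing.
        return f"{seconds:.2f}" if seconds is not None else "?"

    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        lines.append(f"{fmt(start)}-{fmt(end)}  {chunk['text'].strip()}")
    return lines


# Hypothetical sample, not real output from the Space.
sample = {
    "text": " hello world",
    "chunks": [
        {"text": " hello", "timestamp": [0.0, 0.42]},
        {"text": " world", "timestamp": [0.5, 0.9]},
    ],
}
print("\n".join(format_words(sample)))
```

For the sample above this prints one line per word, `0.00-0.42  hello` followed by `0.50-0.90  world`.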

	## API Usage

	You can also use this Space programmatically via the REST API:

	```python
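	import requests

	# Replace "your-space-name" with the Space's actual subdomain.
	url = "https://your-space-name.hf.space/api/predict"

	# Open the file in a context manager so the handle is closed after the upload.
	with open('/path/to/your-audio-file.mp3', 'rb') as audio:
	    response = requests.post(url, files={'audio_input': audio}, timeout=300)

	response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
	print(response.json())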
	```

	## Model Details

	This app uses the [nyrahealth/CrisperWhisper](https://huggingface.co/nyrahealth/CrisperWhisper) model, a Whisper variant designed for verbatim transcription (including fillers and false starts) with accurate word-level timestamps.

	## System Requirements

	For optimal performance, this Space should be run with:
	- GPU acceleration
	- At least 8 GB RAM

	---

	tags:
	- speech-to-text
	- transcription
	- whisper
	- gradio
	- audio-processing