cheesecz committed on
Commit
656229d
·
verified ·
1 Parent(s): 851494e

Update README.md

Files changed (1)
  1. README.md +51 -0
README.md CHANGED
@@ -12,3 +12,54 @@ short_description: filler
  ---
 
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+ # 🎙️ CrisperWhisper Speech-to-Text
+
+ This Hugging Face Space provides a speech-to-text transcription service powered by the [nyrahealth/CrisperWhisper](https://huggingface.co/nyrahealth/CrisperWhisper) model. Upload an audio file to get a transcript with word-level timestamps.
+
+ ## Features
+
+ - Transcribes audio files to text with word-level timestamps
+ - Supports multiple audio formats (MP3, WAV, M4A, OGG, FLAC)
+ - Supports files up to 30 MB
+ - Simple web interface built with Gradio
+ - REST API endpoint for programmatic access
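The format and size limits listed above can be checked on the client before uploading. What follows is a minimal sketch, not the Space's own code: the helper name `check_audio_file` is hypothetical, the extension set simply mirrors the formats in the list, and "30MB" is assumed to mean 30 × 1024 × 1024 bytes.

```python
import os

# Assumptions: extensions mirror the formats listed above; "30MB" is read as 30 MiB.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac"}
MAX_SIZE_BYTES = 30 * 1024 * 1024


def check_audio_file(path):
    """Return a list of problems with the file; an empty list means it looks acceptable."""
    problems = []
    # Compare extensions case-insensitively so "clip.MP3" is accepted.
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    # os.path.getsize raises OSError if the file does not exist.
    size = os.path.getsize(path)
    if size > MAX_SIZE_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_SIZE_BYTES}-byte limit")
    return problems
```

Returning a list rather than raising lets a caller report every problem at once, for example by printing each entry before deciding whether to upload.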
+
+ ## How to Use
+
+ 1. Upload an audio file using the interface
+ 2. Click "Transcribe"
+ 3. View the plain-text transcription and the detailed JSON output with timestamps
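The JSON output from step 3 can be post-processed into a readable word list. The sketch below assumes the JSON follows the shape that the `transformers` speech-recognition pipeline returns with word timestamps (a `text` string plus a `chunks` list whose items carry `text` and a `[start, end]` `timestamp`). This README does not confirm that shape, so the keys, the sample data, and the helper name `format_word_timestamps` are all assumptions.

```python
import json


def format_word_timestamps(result):
    """Turn a result dict into one 'start-end  word' line per word."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        # The end time can be missing (None) for the last word in some outputs.
        end_text = f"{end:.2f}" if end is not None else "?"
        lines.append(f"{start:.2f}-{end_text}  {chunk['text'].strip()}")
    return lines


# Hypothetical sample; the Space's real JSON may use different keys.
sample = json.loads(
    '{"text": "hello world", "chunks": ['
    '{"text": "hello", "timestamp": [0.0, 0.42]},'
    '{"text": " world", "timestamp": [0.5, 0.9]}]}'
)
for line in format_word_timestamps(sample):
    print(line)
```

If the actual output uses different field names, only the two dictionary lookups inside the loop need to change.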
+
+ ## API Usage
+
+ You can also use this Space programmatically via the REST API:
+
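+ ```python
+ import requests
+
+ # Replace with your Space's URL; the endpoint path and field name depend on how the app is defined.
+ url = "https://your-space-name.hf.space/api/predict"
+
+ # Open the file in a context manager so the handle is closed after the upload.
+ with open('/path/to/your-audio-file.mp3', 'rb') as audio_file:
+     response = requests.post(url, files={'audio_input': audio_file})
+
+ response.raise_for_status()  # fail loudly on HTTP errors
+ print(response.json())
+ ```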
+
+ ## Model Details
+
+ This app uses the [nyrahealth/CrisperWhisper](https://huggingface.co/nyrahealth/CrisperWhisper) model, a Whisper variant aimed at verbatim transcription with accurate word-level timestamps.
+
+ ## System Requirements
+
+ For best performance, run this Space with:
+ - GPU acceleration
+ - At least 8 GB RAM
+
+ ---
+
+ tags:
+ - speech-to-text
+ - transcription
+ - whisper
+ - gradio
+ - audio-processing