Suzie Oh

ohsuz

AI & ML interests

None yet

Recent Activity

liked a dataset 7 days ago
open-thoughts/OpenThoughts3-1.2M
liked a Space 7 days ago
VIDraft/SOMA-AGI
reacted to openfree's post with 🔥 20 days ago
🌏 Whisper-OCR Multilingual Translation Space 🚀
Welcome! This Space takes English audio, video, images, and PDFs and instantly converts them into Chinese (ZH), Thai (TH), and Russian (RU); no other source language required.
https://huggingface.co/spaces/VIDraft/voice-trans

✨ Key Features
🎤 Microphone – Record English speech → transcript + 3-language translation
🔊 Audio File – Upload English audio → transcript + translation
🎬 Video File – Auto-extract audio with FFmpeg → transcript + translation
🖼️ Image – Nanonets-OCR pulls text → translation
📄 PDF – Up to 50 pages of text & tables → translation
🔄 Realtime Mode – Flush every 10-15 s; newest lines appear at the top

🛠️ Quick Start
Click “Duplicate” to fork, or launch directly.
Pick a tab (🎤/🔊/🎬/🖼️/📄/🔄) and feed it English input.
After a few seconds, see the 📜 original and 🌐 3-language translation side by side.

⚡ Tech Stack
openai/whisper-large-v3-turbo: fast, high-accuracy ASR
Nanonets-OCR-s (+ Flash Attention 2): document/image OCR
Gradio Blocks: clean tabbed UI
PyTorch + CUDA: auto GPU allocation & ThreadPool load balancing

📌 Notes
Translation quality depends on audio quality, lighting, and resolution.
Huge videos hit the HF Space upload cap (~2 GB).
Realtime tab requires browser microphone permission.

Organizations

HAE-RAE, FIN_NUMBER, Translation-EnKo, KKACHI