OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview Image-Text-to-Text • 0.4B • Updated 6 days ago • 10.7k • 61
Transcribing 1 hour of audio for less than $0.01 🤯
@mfuntowicz cooked with 8x faster Whisper speech recognition - whisper-large-v3-turbo transcribes at 100x real time on a $0.80/hr L4 GPU!
How they did it: https://huggingface.co/blog/fast-whisper-endpoints
1-click deploy with HF Inference Endpoints: https://endpoints.huggingface.co/new?repository=openai%2Fwhisper-large-v3-turbo&vendor=aws&region=us-east&accelerator=gpu&instance_id=aws-us-east-1-nvidia-l4-x1&task=automatic-speech-recognition&no_suggested_compute=true
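The "less than $0.01" figure follows from the two numbers quoted in the post. As a rough sketch, assuming the GPU is billed only for the time it spends transcribing (no startup, idle, or minimum-billing overhead), the cost per hour of audio is the hourly GPU price divided by the real-time factor. The function name `cost_per_audio_hour` is hypothetical and used only for illustration.

```python
def cost_per_audio_hour(gpu_usd_per_hour: float, realtime_factor: float) -> float:
    """Estimate the GPU cost in USD to transcribe one hour of audio.

    Assumes billing covers only active transcription time, which is a
    simplification; real deployments also pay for startup and idle time.
    """
    # At N x real time, one hour of audio needs 1/N hours of GPU time.
    gpu_hours_needed = 1.0 / realtime_factor
    return gpu_usd_per_hour * gpu_hours_needed


# Figures quoted in the post: $0.80/hr L4 GPU, 100x real time.
cost = cost_per_audio_hour(0.80, 100.0)
print(f"${cost:.4f} per hour of audio")  # prints "$0.0080 per hour of audio"
```

At 100x real time, one hour of audio takes about 36 seconds of GPU time, so the estimate comes to roughly $0.008, under the $0.01 quoted.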
Running • 155 • SmolVLM realtime WebGPU ⚡ • Interact with the camera to get descriptions based on instructions