Running • 9 • NVIDIA Parakeet TDT 0.6B V2 Real Time Mic Transcription ASR STT 📊 Real-Time, Speak to Mic, NO MODEL DOWNLOAD NEEDED!!
Turkish Instruction Datasets Collection Collection of instruction datasets for Turkish. • 42 items • Updated May 24 • 13
Sleeping • 20 • TEN Agent with VAD and Turn Detection 🔥 A Conversational Voice AI Agent powered by the TEN Framework
Post: A real-time object detector that is much faster and more accurate than YOLO, with an Apache 2.0 license, just landed in Hugging Face transformers 🔥

D-FINE is the SOTA real-time object detector that runs on a T4 (free Colab) 🤩

> Collection with all checkpoints and demo: ustc-community/d-fine-68109b427cbe6ee36b4e7352

Notebooks:
> Tracking: https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference: https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning: https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb

h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper 🎩

Regular object detectors attempt to predict bounding boxes as pixel-perfect (x, y, w, h) coordinates, which is very rigid and hard to solve 🥲☹️

D-FINE formulates object detection as predicting a distribution over bounding box coordinates and refines them iteratively, which makes it more accurate 🤩

Another core idea behind this model is Global Optimal Localization Self-Distillation ⤵️ The model uses the final layer's distribution output (sort of like a teacher) to distill into earlier layers, making the early layers more performant.
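The two ideas in the post (predicting a distribution per box edge instead of a single number, and distilling the final layer's distribution into earlier layers) can be illustrated with a toy sketch. This is not D-FINE's actual implementation: the bin values, the logits, the additive refinement step, and the plain KL loss are all hypothetical stand-ins chosen to keep the example small and dependency-free.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_offset(logits, bin_values):
    """Decode one box-edge offset as the mean of the predicted distribution."""
    probs = softmax(logits)
    return sum(p * v for p, v in zip(probs, bin_values))

def refine_logits(logits, residual_logits):
    """Iterative refinement (toy version): a later layer adds residual logits."""
    return [a + b for a, b in zip(logits, residual_logits)]

def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student), used here as a stand-in distillation loss."""
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(teacher_probs, student_probs)
    )

# Hypothetical setup: 5 candidate offsets (in pixels) for one edge of a box.
bin_values = [-2.0, -1.0, 0.0, 1.0, 2.0]

# An early layer that is still undecided (uniform distribution).
early_logits = [0.0, 0.0, 0.0, 0.0, 0.0]

# A later layer sharpens the distribution around +1 pixel.
residual = [-1.0, 0.0, 1.0, 3.0, 1.0]
final_logits = refine_logits(early_logits, residual)

early_offset = expected_offset(early_logits, bin_values)
final_offset = expected_offset(final_logits, bin_values)

# Self-distillation signal: the final layer acts as teacher for the early layer.
distill_loss = kl_divergence(softmax(final_logits), softmax(early_logits))

print(f"early offset: {early_offset:.3f}")
print(f"final offset: {final_offset:.3f}")
print(f"distillation loss: {distill_loss:.3f}")
```

In this toy setup the early layer decodes to no shift at all, the refined layer shifts the edge in the positive direction, and the non-zero KL term is what would push the early layer toward the final layer's distribution during training.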
ysdede/whisper-khanacademy-large-v3-turbo-tr Automatic Speech Recognition • 0.8B • Updated Apr 23 • 34 • 1