Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play Paper • 2505.02707 • Published 24 days ago • 80
Slam Collection All resources for SpeechLMs from "Slamming: Training a Speech Language Model on One GPU in a Day". We provide the tokeniser, LM, and datasets • 7 items • Updated 7 days ago • 13
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated Feb 6 • 53
Article Evaluating Audio Reasoning with Big Bench Audio By mhillsmith and 1 other • Dec 20, 2024 • 21
Article Yay! Organizations can now publish blog Articles By huggingface and 3 others • Jan 20 • 45
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 24 days ago • 265
Open Whisper-style Speech Models (OWSM) Collection Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/ • 19 items • Updated 10 days ago • 5
Article TTS Arena: Benchmarking Text-to-Speech Models in the Wild By mrfakename and 6 others • Feb 27, 2024 • 66
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 24 days ago • 226
Article SmolLM - blazingly fast and remarkably powerful By loubnabnl and 2 others • Jul 16, 2024 • 374