abdar1925's Collections

Audio models

updated 5 days ago

  • kyutai/moshika-vis-pytorch-bf16

    Updated Mar 22 • 56

  • sesame/csm-1b

    Text-to-Speech • Updated Mar 16 • 48.6k • 2.01k

  • kyutai/mimi

    Feature Extraction • Updated Sep 18, 2024 • 667k • 207

  • kyutai/moshiko-pytorch-bf16

    Updated Sep 18, 2024 • 162k • 176

  • nvidia/canary-1b-flash

    Automatic Speech Recognition • Updated Mar 18 • 229k • 197

  • canopylabs/orpheus-3b-0.1-ft

    Text-to-Speech • Updated 6 days ago • 90.5k • 542

  • stepfun-ai/Step-Audio-Chat

    Audio-Text-to-Text • Updated Feb 17 • 245 • 440

  • Zyphra/Zonos-v0.1-hybrid

    Text-to-Speech • Updated 6 days ago • 12.9k • 1.06k

  • hexgrad/Kokoro-82M

    Text-to-Speech • Updated Apr 10 • 1.85M • 4.29k

  • Qwen/Qwen2.5-Omni-7B

    Any-to-Any • Updated 12 days ago • 181k • 1.59k

  • ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model

    Paper • 2503.21144 • Published Mar 27 • 25

  • nari-labs/Dia-1.6B

    Text-to-Speech • Updated 7 days ago • 154k • 2.08k

  • nvidia/parakeet-tdt-0.6b-v2

    Automatic Speech Recognition • Updated 11 days ago • 91.8k • 755