Hugging Face
Giuliano's Collections
Agents 2.0
Multimodal
Voice
Video Gen
text2sql
Medicine
LLM Personalization
Agents
Agents SWE
Agents GUI
LLM Reasoning

Voice

updated Jan 6

  • VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

    Paper • 2501.01957 • Published Jan 3 • 47