csuhan's Collections
Tar
OneLLM

Tar

updated 2 days ago

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

  • Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

    Paper • 2506.18898 • Published 2 days ago • 23

  • Tar

    Space • Running on A10G • 59

    Unified MLLM with Text-Aligned Representations
