Collections
Collections including paper arxiv:2311.16567
- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
  Paper • 2312.08578 • Published • 16
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
  Paper • 2312.08583 • Published • 9
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 11
- StemGen: A music generation model that listens
  Paper • 2312.08723 • Published • 47

- FaceStudio: Put Your Face Everywhere in Seconds
  Paper • 2312.02663 • Published • 30
- SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers
  Paper • 2401.08740 • Published • 12
- DiffusionGPT: LLM-Driven Text-to-Image Generation System
  Paper • 2401.10061 • Published • 27
- MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices
  Paper • 2311.16567 • Published • 22