Large Motion Video Autoencoding with Cross-modal Video VAE • Paper • arXiv:2412.17805 • Published Dec 23, 2024
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering • Paper • arXiv:2311.16465 • Published Nov 28, 2023
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models • Paper • arXiv:2109.10282 • Published Sep 21, 2021