ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations Paper • 2504.00824 • Published Apr 1 • 41
ABC: Achieving Better Control of Multimodal Embeddings using VLMs Paper • 2503.00329 • Published Mar 1 • 19
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation Paper • 2412.00927 • Published Dec 1, 2024 • 28
VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation Paper • 2312.14867 • Published Dec 22, 2023 • 1
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision Paper • 2411.07199 • Published Nov 11, 2024 • 50