Enhancing Vision-Language Model with Unmasked Token Alignment • Paper • 2405.19009 • Published May 29, 2024
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment • Paper • 2406.19736 • Published Jun 28, 2024