Robustness in Both Domains: CLIP Needs a Robust Text Encoder • Paper • 2506.03355 • Published 4 days ago
On the Adversarial Robustness of Multi-Modal Foundation Models • Paper • 2308.10741 • Published Aug 21, 2023
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models • Paper • 2402.12336 • Published Feb 19, 2024
FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens • Paper • 2506.03096 • Published 4 days ago