Dynamic Relation Transformer for Contextual Text Block Detection Paper • 2401.09232 • Published Jan 17, 2024
Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks Paper • 2504.01308 • Published 13 days ago • 13
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis Paper • 2503.15893 • Published 26 days ago • 2
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding Paper • 2412.10302 • Published Dec 13, 2024 • 18