Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification
Abstract
Detecting out-of-distribution (OOD) data is crucial in machine learning applications to mitigate the risk of model overconfidence, thereby enhancing the reliability and safety of deployed systems. Most existing OOD detection methods address uni-modal inputs, such as images or text. These methods were developed primarily for computer vision tasks, and their performance on multi-modal documents has not been studied extensively. We propose a novel methodology, termed attention head masking (AHM), for multi-modal OOD tasks in document classification systems. Our empirical results demonstrate that the proposed AHM method outperforms all state-of-the-art approaches and significantly decreases the false positive rate (FPR), by up to 7.5% compared to existing solutions. This methodology generalizes well to multi-modal data, such as documents, where visual and textual information are modeled under the same Transformer architecture. To address the scarcity of high-quality publicly available document datasets and encourage further research on OOD detection for documents, we introduce FinanceDocs, a new document AI dataset. Our code and dataset are publicly available.
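The abstract does not spell out how AHM selects or masks heads, so the following is only a generic sketch of what masking attention heads means in a Transformer layer, not the authors' implementation. The function names (`attention_head`, `masked_multi_head`), the binary `head_mask`, and the omission of the output projection are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(q, k, v):
    """Scaled dot-product attention for one head.

    q, k, v are lists of seq_len vectors (each a list of floats).
    """
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append(
            [sum(w * vj[c] for w, vj in zip(weights, v)) for c in range(len(v[0]))]
        )
    return out


def masked_multi_head(q_heads, k_heads, v_heads, head_mask):
    """Concatenate per-head outputs, scaling head h by head_mask[h].

    Hypothetical helper: a mask value of 0 silences a head, 1 keeps it.
    The usual output projection is left out to keep the sketch short.
    """
    seq_len = len(q_heads[0])
    rows = [[] for _ in range(seq_len)]
    for q, k, v, m in zip(q_heads, k_heads, v_heads, head_mask):
        head_out = attention_head(q, k, v)
        for t in range(seq_len):
            rows[t].extend(m * x for x in head_out[t])
    return rows


# Toy input: 2 heads, 2 tokens, head dimension 2.
q = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, 1.0]]]
v = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
features = masked_multi_head(q, q, v, head_mask=[1, 0])  # second head masked
```

In this toy setup the masked head contributes zeros to the concatenated feature vector, so any downstream OOD score computed on `features` depends only on the heads that were kept.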
Community
📝 Announcing our paper, which introduces (i) a novel Attention Head Masking (AHM) technique that significantly enhances out-of-distribution (OOD) detection in multimodal document classification, and (ii) a new dataset, FinanceDocs, to support further AI-based document intelligence research.
- Novel AHM Methodology: The paper proposes the AHM technique, which improves feature representation and enhances OOD detection performance in transformer-based multimodal document classification.
- Empirical Success: AHM significantly reduces the false positive rate (FPR) in OOD detection compared to state-of-the-art approaches, by up to 7.5%.
- New Dataset: The introduction of FinanceDocs, a high-quality multimodal dataset for OOD detection in documents, addresses the lack of suitable public datasets in this domain.
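For readers unfamiliar with the FPR metric mentioned above, here is a minimal sketch of how a false positive rate at a fixed true positive rate is commonly computed in OOD evaluation. The post does not state the operating point or the score convention the paper uses, so the default `tpr=0.95`, the "higher score means more in-distribution" convention, and the function name `fpr_at_tpr` are assumptions.

```python
import math


def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """Fraction of OOD samples accepted when `tpr` of ID samples are accepted.

    Assumes higher scores mean "more in-distribution" and treats
    in-distribution samples as the positive class.
    """
    ranked = sorted(id_scores, reverse=True)
    # Smallest number of ID samples that must be accepted to reach the target TPR.
    n_keep = max(1, math.ceil(tpr * len(ranked)))
    threshold = ranked[n_keep - 1]
    # OOD samples scoring at or above the threshold are false positives.
    false_pos = sum(1 for s in ood_scores if s >= threshold)
    return false_pos / len(ood_scores)


# Toy scores, purely illustrative (not results from the paper).
id_scores = [0.9, 0.8, 0.7, 0.1]
ood_scores = [0.75, 0.2, 0.05, 0.6]
print(fpr_at_tpr(id_scores, ood_scores, tpr=0.75))  # prints 0.25
```

A lower value means fewer OOD documents are mistaken for in-distribution ones at the chosen operating point.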
Hi @amanchadha,
Congrats on this work! Are you planning to share any artifacts (datasets, models, a demo as a Space) on the hub?
Would be cool for people to further improve upon this method for detecting OOD in document classification.
Happy to assist :)
Cheers,
Niels from HF
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Out-of-Distribution Detection through Soft Clustering with Non-Negative Kernel Regression (2024)
- FlowCon: Out-of-Distribution Detection using Flow-Based Contrastive Learning (2024)
- LAPT: Label-driven Automated Prompt Tuning for OOD Detection with Vision-Language Models (2024)
- OAML: Outlier Aware Metric Learning for OOD Detection Enhancement (2024)
- TTA-OOD: Test-time Augmentation for Improving Out-of-Distribution Detection in Gastrointestinal Vision (2024)