FinAI-BERT-IslamicBanks

A Domain-Specific BERT Model for Detecting AI Disclosures in Islamic Banking


Model Description

FinAI-BERT-IslamicBanks is a transformer model fine-tuned from bert-base-uncased to detect AI-related disclosures in the context of Islamic banking. The model is tailored for financial NLP tasks and was trained on manually annotated sentences drawn from 855 annual reports issued by 106 Islamic banks across 25 countries between 2015 and 2024.


Intended Use

The model is designed to support:

  • Academic research on AI adoption in Islamic finance
  • Regulatory screening of AI-related disclosures
  • Technology and ESG audits of Islamic financial institutions
  • Index construction for benchmarking AI readiness in Islamic banking

Performance

  • Accuracy: 98.67%
  • F1 Score: 0.9868
  • ROC AUC: 0.9999
  • Brier Score: 0.0027

The model demonstrates high semantic sensitivity, excellent calibration, and strong generalization across diverse report formats.
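For reference, metrics of this kind can be computed with scikit-learn from held-out labels and predicted positive-class probabilities. The sketch below uses illustrative values, not the actual evaluation data:

import numpy as np
from sklearn.metrics import (accuracy_score, f1_score,
                             roc_auc_score, brier_score_loss)

# Illustrative held-out labels and positive-class probabilities
y_true = np.array([1, 0, 1, 1, 0, 0])
y_prob = np.array([0.97, 0.03, 0.88, 0.91, 0.12, 0.40])

y_pred = (y_prob >= 0.5).astype(int)                    # threshold at 0.5
print("Accuracy: ", accuracy_score(y_true, y_pred))
print("F1 Score: ", f1_score(y_true, y_pred))
print("ROC AUC:  ", roc_auc_score(y_true, y_prob))      # ranks probabilities
print("Brier:    ", brier_score_loss(y_true, y_prob))   # calibration; lower is better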


Training Data

  • Total examples: 2,632 sentence-level instances
    • 1,316 AI-related (seed-word filtered and manually verified)
    • 1,316 Non-AI (randomly sampled)
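The seed-word filtering step can be approximated as below. This is a minimal sketch, assuming ai_seedwords.csv stores one lowercase term per row in a "term" column; the actual schema and matching rules may differ:

import re
import pandas as pd

# Load the AI lexicon (the column name "term" is an assumption)
seeds = pd.read_csv("ai_seedwords.csv")["term"].str.lower().tolist()
pattern = re.compile(r"\b(" + "|".join(map(re.escape, seeds)) + r")\b")

def is_ai_candidate(sentence: str) -> bool:
    # Flag sentences containing any seed term; in the actual pipeline,
    # flagged candidates were then manually verified by annotators.
    return bool(pattern.search(sentence.lower()))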

Training Setup

  • Base model: bert-base-uncased
  • Tokenizer: WordPiece
  • Environment: Python 3, Jupyter, Google Compute Engine (GPU)
  • Batch size: 8
  • Epochs: 3
  • Max sequence length: 128
  • Loss function: Cross-entropy
  • Optimizer: AdamW
  • Precision: FP16 (mixed-precision enabled)
  • Framework: Hugging Face Transformers (Trainer API)
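Under these settings, fine-tuning with the Trainer API looks roughly like the sketch below. The file name bert_training_data.csv comes from the supplementary material; its column names (sentence, label) and the train/test split are assumptions:

import pandas as pd
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

df = pd.read_csv("bert_training_data.csv")          # columns assumed: sentence, label
dataset = Dataset.from_pandas(df).train_test_split(test_size=0.2)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Pad/truncate to the 128-token maximum used in training
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

args = TrainingArguments(
    output_dir="finai-bert-islamicbanks",
    per_device_train_batch_size=8,   # batch size 8
    num_train_epochs=3,              # 3 epochs
    fp16=True,                       # mixed precision
)                                    # AdamW and cross-entropy are Trainer defaults

trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["test"])
trainer.train()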

Files Included

  • config.json, model.safetensors: Model configuration and weights
  • tokenizer.json, vocab.txt, tokenizer_config.json, special_tokens_map.json: Tokenizer files

Supplementary Material

The following supplementary files are provided to support the implementation and replication of FinAI-BERT-IslamicBanks. All materials are included in the archive titled Supplementary_Material.zip:

  • FinAI-BERT-IslamicBanks Training Data Extraction.ipynb – Python notebook for corpus preprocessing, sentence segmentation, and annotation
  • ai_seedwords.csv – Lexicon of AI-related terms used to guide weak supervision during annotation
  • bert_training_data.csv – Annotated dataset containing AI and Non-AI sentences
  • FinAI-BERT-IslamicBanks.ipynb – Notebook for model training and evaluation

πŸ“Œ Citation

If you use this model in your research or applications, please cite our paper:

Zafar, M. B. (2025). FinAI-BERT-IslamicBanks: A Domain-Specific Model for Detecting AI Disclosures in Islamic Banking. SSRN. https://ssrn.com/abstract=5337214


Model Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("bilalzafar/FinAI-BERT-IslamicBanks")
model = AutoModelForSequenceClassification.from_pretrained("bilalzafar/FinAI-BERT-IslamicBanks")

# Create the pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Label mapping
label_map = {
    "LABEL_0": "Non-AI",
    "LABEL_1": "AI"
}

# Input text
text = "Our Shariah-compliant bank has deployed AI-driven credit risk assessment tools."

# Run classification
result = classifier(text)[0]

# Output
label = label_map.get(result['label'], result['label'])
score = result['score']
print(f"Classification: {label} | Score: {score:.4f}")