Model Card for Bangla Handwritten Character Recognition
This is a Bangla handwritten character recognition model that classifies 195 character classes, including compound characters. The model is designed for accurate offline recognition of Bangla handwritten script.
Model Details
Architecture: EfficientNetV2S (feature extractor)
Custom Layers: Triple-head architecture with two attention mechanisms, followed by a soft average ensemble layer.
Parameters: 30,912,695 (total)
Framework: TensorFlow / Keras
Finetuned from: EfficientNetV2S pretrained on ImageNet
Developed by: Team Segfault
License: MIT
Model Description
This model is a deep learning-based handwritten character classifier designed specifically for the Bangla language. It recognizes 195 different Bangla characters, including basic vowels and consonants as well as compound letters.
Built using the EfficientNetV2S architecture as a feature extractor, the model applies a custom triple-head fully connected (FC) architecture with two attention mechanisms to enhance discriminative power. The outputs from all three heads are averaged using a soft-voting strategy to produce the final prediction, improving generalization and reducing overfitting.
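The soft-voting step described above can be illustrated with a small NumPy sketch. It assumes each head emits one logit vector per image and that the ensemble is a plain unweighted mean of the per-head softmax probabilities; the function names and the random logits are hypothetical stand-ins, not the released code.

```python
import numpy as np

NUM_CLASSES = 195  # number of Bangla character classes stated in this card


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def soft_vote(head_logits: list) -> np.ndarray:
    """Average the class probabilities of all heads (unweighted soft voting)."""
    # Stack to shape (heads, batch, classes), then average over the head axis.
    probs = np.stack([softmax(logits) for logits in head_logits], axis=0)
    return probs.mean(axis=0)


# Hypothetical logits for a batch of 4 images from three heads.
rng = np.random.default_rng(0)
logits_per_head = [rng.normal(size=(4, NUM_CLASSES)) for _ in range(3)]

avg_probs = soft_vote(logits_per_head)
predicted_class = avg_probs.argmax(axis=-1)  # one class index per image
```

Because each head's output is a probability distribution, their mean is one as well, so the averaged rows still sum to 1 and can be read as class confidences.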
The goal of this model is to support accurate offline recognition of handwritten Bangla text, which is useful for educational tools, document digitization, and OCR applications focused on Bangla script. It reaches about 96.6% accuracy on the held-out test split of a class-balanced dataset, helped by a data augmentation pipeline that includes ElasticTransform and random rotation.
This model was trained on the MatrivashaBangla dataset (a newly combined dataset) containing about 500,000 samples. The entire training and evaluation pipeline is implemented using TensorFlow/Keras.
- Developed by: Team Segfault
- Model type: Deep Neural Network
- License: MIT
- Finetuned from model: EfficientNetV2S
Uses
- Offline OCR systems for Bangla script
- Educational software or Bangla digitization tools
- Document recognition systems
Out-of-Scope Use
- Not intended for real-time inference on low-power devices
- Not designed for other scripts or languages
Training Details
Training Data
Dataset Name: MatrivashaBangla (a newly combined dataset built from BanglaLekha-Isolated and Matrivasha_raw)
Size: 195 classes, ~500,000 images
Augmentations: ElasticTransform, Rotate
Balance: Class-balanced via custom augmentation scripts (2,500 to 2,580 images per class)
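The balancing scripts themselves are not part of this card. As a rough sketch of the arithmetic involved, the snippet below computes how many augmented images each class would need to reach a per-class target, and checks that the stated per-class range is consistent with the stated dataset size. The helper name `augmentation_plan`, the target of 2,500, and the example class counts are all hypothetical.

```python
NUM_CLASSES = 195


def augmentation_plan(class_counts: dict, target: int = 2500) -> dict:
    """Return how many augmented images each class needs to reach `target`."""
    return {label: max(0, target - count) for label, count in class_counts.items()}


# Made-up raw counts for three classes, for illustration only.
raw_counts = {"class_a": 1900, "class_b": 2500, "class_c": 620}
plan = augmentation_plan(raw_counts)

# Consistency check on the figures stated in the card:
# 195 classes with 2,500 to 2,580 images each.
min_total = NUM_CLASSES * 2500  # 487,500
max_total = NUM_CLASSES * 2580  # 503,100
```

The resulting bounds bracket the roughly 500,000 images reported for the dataset.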
Evaluation
Train Accuracy: ~97.59%
Validation Accuracy: ~96.69%
Metrics: Accuracy, F1-score, Precision, Recall
Evaluation Dataset: Test split from MatrivashaBangla (10%), Accuracy ~96.60%
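The card lists accuracy, precision, recall, and F1-score but does not say how the per-class values are averaged. The sketch below assumes macro averaging (an unweighted mean over classes) and derives all four metrics from a confusion matrix with NumPy; the function name and the toy labels are illustrative only.

```python
import numpy as np


def macro_metrics(y_true, y_pred, num_classes: int) -> dict:
    """Accuracy plus macro-averaged precision, recall, and F1 from a confusion matrix."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)

    tp = np.diag(cm).astype(float)
    predicted_total = cm.sum(axis=0)  # how often each class was predicted
    true_total = cm.sum(axis=1)       # how often each class actually occurs

    # Classes with a zero denominator get a score of 0 instead of NaN.
    precision = np.divide(tp, predicted_total, out=np.zeros_like(tp), where=predicted_total > 0)
    recall = np.divide(tp, true_total, out=np.zeros_like(tp), where=true_total > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)

    return {
        "accuracy": float(tp.sum() / cm.sum()),
        "precision": float(precision.mean()),
        "recall": float(recall.mean()),
        "f1": float(f1.mean()),
    }


# Toy example with 3 classes; the real evaluation would use 195.
metrics = macro_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
```

On a class-balanced test split, macro and weighted averages should be close, so the choice matters less here than it would on a skewed dataset.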
Model Architecture
EfficientNetV2S (Frozen Layers)
        ↓
Feature Output (Shared)
        ↓
GlobalAveragePooling
        │
        ├── FC Head 1 with Attention
        ├── FC Head 2 with Attention
        └── FC Head 3 (Baseline)
        ↓
Soft Average of 3 Head Outputs
        ↓
Final Prediction (195 classes)
Limitations
- May misclassify extremely poor handwriting or characters written on noisy backgrounds
- Not tested for real-time edge deployment
- Only trained on standard handwritten script; no cursive, artistic, or stylized forms
Environmental Impact
- Compute Region: Local training (not on cloud)
- Carbon Emitted: ~5-10 kg CO2
Model Card Contact
Author: Meharaz Hossain
Email: [email protected]
GitHub: https://github.com/meharaz733
Hugging Face: https://huggingface.co/meharaz733