Index Card Classifier
A fine-tuned image classifier that distinguishes library index cards from other image types (the verso, i.e. the back, of cards; covers; blank pages).
Model Details
- Base Model: efficientnet_b0
- Task: Binary classification (index_card vs other)
- Training Data: 70 images
- Validation Data: 30 images
- Framework: PyTorch + timm
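Training Data Distribution
The label counts below sum to 100, so they appear to describe the full labelled set (70 training plus 30 validation images) before the split.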
{
  "cover": 1,
  "blank": 1,
  "needs_review": 1,
  "verso": 49,
  "index_card": 46,
  "other": 2
}
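Since the task is binary, the six raw labels have to be collapsed into two classes before training. The snippet below is a minimal sketch of that step, assuming every label other than `index_card` is mapped to `other`; the helper name `to_binary` is hypothetical and the mapping actually used by the training script is not stated here.

```python
from collections import Counter

# Raw label counts as listed in the distribution above.
label_counts = {
    "cover": 1,
    "blank": 1,
    "needs_review": 1,
    "verso": 49,
    "index_card": 46,
    "other": 2,
}


def to_binary(label):
    """Map a raw label to the binary target (assumed mapping, see lead-in)."""
    return "index_card" if label == "index_card" else "other"


# Sum the raw counts under their binary class.
binary_counts = Counter()
for label, count in label_counts.items():
    binary_counts[to_binary(label)] += count

print(dict(binary_counts))
```

Under this assumption the two classes are close to balanced (46 `index_card` against 54 `other`).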
Performance
| Metric | Value |
|---|---|
| Validation Accuracy | 96.7% |
| Validation F1 (index_card) | 0.966 |
| Validation F1 (other) | 0.968 |
Classification Report
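| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| other | 1.000 | 0.938 | 0.968 | 16 |
| index_card | 0.933 | 1.000 | 0.966 | 14 |
| accuracy | | | 0.967 | 30 |
| macro avg | 0.967 | 0.969 | 0.967 | 30 |
| weighted avg | 0.969 | 0.967 | 0.967 | 30 |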
Confusion Matrix
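Rows are true classes and columns are predicted classes, in the order other, index_card.

| True class | Predicted other | Predicted index_card |
|---|---|---|
| other | 15 | 1 |
| index_card | 0 | 14 |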
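The reported precision, recall, F1 and accuracy can be recomputed from the confusion matrix. The following is a small sketch in plain Python, reading the matrix with rows as true classes and columns as predicted classes in the order `other`, `index_card`; the function name `per_class_metrics` is illustrative only.

```python
# Validation confusion matrix: rows = true class, columns = predicted class.
classes = ["other", "index_card"]
confusion = [
    [15, 1],
    [0, 14],
]


def per_class_metrics(cm, idx):
    """Return (precision, recall, f1) for the class at position idx."""
    tp = cm[idx][idx]
    fp = sum(row[idx] for row in cm) - tp  # predicted as idx, but another class
    fn = sum(cm[idx]) - tp                 # truly idx, but predicted otherwise
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(len(confusion))) / total
metrics = {name: per_class_metrics(confusion, i) for i, name in enumerate(classes)}

for name, (precision, recall, f1) in metrics.items():
    print(f"{name:<11} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"accuracy    {accuracy:.3f}")
```

The single error is one `other` image predicted as `index_card`, which is why `other` loses recall and `index_card` loses precision.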
Usage
import timm
import torch
from huggingface_hub import hf_hub_download
from PIL import Image
from safetensors.torch import load_file
from torchvision import transforms
# Download and load model from Hub
weights_path = hf_hub_download(
    repo_id="davanstrien/nls-index-card-classifier",
    filename="classifier.safetensors"
)
model = timm.create_model('efficientnet_b0', pretrained=False, num_classes=2)
model.load_state_dict(load_file(weights_path))
model.eval()
# Preprocess
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
# Inference
image = Image.open('card.jpg').convert('RGB')
input_tensor = transform(image).unsqueeze(0)
with torch.no_grad():
    output = model(input_tensor)
    probs = torch.softmax(output, dim=1)
    pred = output.argmax(1).item()
    confidence = probs[0, pred].item()
classes = ['other', 'index_card']
print(f"Prediction: {classes[pred]} ({confidence:.1%})")
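With only 30 validation images, it may be worth sending low-confidence predictions to a person instead of accepting them automatically. The sketch below shows one way to do that on the two raw logits the model returns. The `decide` helper, the `0.9` threshold and the reuse of `needs_review` as a fallback label are all assumptions for illustration, not part of the released model.

```python
import math

CLASSES = ["other", "index_card"]  # same order as in the usage snippet


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, threshold=0.9):
    """Return (label, confidence); fall back to 'needs_review' below threshold."""
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[pred]
    label = CLASSES[pred] if confidence >= threshold else "needs_review"
    return label, confidence


# Hypothetical logits, e.g. taken from output[0].tolist() in the snippet above.
print(decide([0.0, 5.0]))
print(decide([0.2, 0.0]))
```

The threshold is arbitrary here and would need tuning against a larger labelled sample.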
Training
The model was trained with the backbone frozen; only the classifier head was fine-tuned.
python train_classifier.py --model efficientnet_b0 --epochs 20 --val-split 0.3
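The contents of `train_classifier.py` are not shown here, so the following is only a sketch of what "frozen backbone, trainable head" usually means in PyTorch with timm. It assumes the head parameters are prefixed with `classifier.` (which is how timm names the EfficientNet head) and that training started from ImageNet-pretrained weights; `freeze_backbone` and `build_frozen_efficientnet` are hypothetical helper names.

```python
def freeze_backbone(model, head_prefix="classifier"):
    """Disable gradients outside the head and return the trainable parameter names."""
    trainable = []
    for name, param in model.named_parameters():
        is_head = name.startswith(head_prefix + ".")
        param.requires_grad = is_head
        if is_head:
            trainable.append(name)
    return trainable


def build_frozen_efficientnet(num_classes=2):
    """Create efficientnet_b0 with only its classifier head left trainable."""
    # Imported lazily so freeze_backbone can be used without timm installed.
    import timm

    # pretrained=True is an assumption about the starting weights.
    model = timm.create_model("efficientnet_b0", pretrained=True, num_classes=num_classes)
    trainable = freeze_backbone(model)
    return model, trainable
```

An optimizer would then be given only the parameters with `requires_grad=True`. Note that freezing gradients does not stop BatchNorm layers from updating their running statistics in train mode; whether the original script handled that is not stated.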
Context
This model was developed for the National Library of Scotland to help process digitized manuscript index cards from the Advocates Library collection.
License
Apache 2.0