DAB-DETR for Biomedical Subfigure Extraction
A transformer-based object detection model designed to detect and extract subfigures (panels) from compound figures in biomedical literature.
Background & Motivation
DAB-DETR ("Dynamic Anchor Boxes are Better Queries for DETR") improves on the original DETR framework by replacing learned positional queries with dynamic anchor boxes. Each anchor box is defined by coordinates (x, y, w, h) and is updated layer by layer to guide cross-attention and speed up convergence. This formulation offers explicit positional priors for improved feature matching and demonstrated strong performance.
In the Open-PMC-18M study, this model was adapted and trained on a synthetic dataset of 500,000 biomedical compound figures to extract subfigures at scale. The model reached an mAP of 98.58% and an F1 of 99.96% on a synthetic holdout set, and an mAP of 36.88% and an F1 of 73.55% on the ImageCLEF 2016 benchmark.
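The layer-by-layer anchor update described above can be illustrated with a small sketch. It assumes the refinement adds each layer's predicted offsets to the anchor in logit space and maps the result back to normalized coordinates with a sigmoid, which is the common iterative box-refinement scheme in DETR-style detectors. The function names and the offset values are hypothetical and are not taken from the released model.

```python
import math

def inverse_sigmoid(p, eps=1e-5):
    """Map a value in (0, 1) to logit space, clamped for numerical stability."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def refine_anchor(anchor, delta):
    """Apply one layer's offsets to a normalized (x, y, w, h) anchor box."""
    return tuple(sigmoid(inverse_sigmoid(a) + d) for a, d in zip(anchor, delta))

# Hypothetical starting anchor and per-layer offsets (illustrative values only).
anchor = (0.5, 0.5, 0.2, 0.2)
layer_deltas = [
    (0.4, -0.2, 0.3, 0.1),
    (0.1, 0.0, 0.1, 0.05),
]

for delta in layer_deltas:
    anchor = refine_anchor(anchor, delta)  # the anchor passed to the next layer

print([round(v, 4) for v in anchor])
```

Working in logit space keeps every refined coordinate inside (0, 1), so the anchor remains a valid normalized box regardless of the size of the offsets.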
Usage Example
from transformers import AutoModelForObjectDetection, AutoImageProcessor
from PIL import Image
import torch
model_name = "vector-institute/pmc-18m-dab-detr"
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForObjectDetection.from_pretrained(model_name)
image = Image.open("compound_figure.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
# Post-process detections; adjust the threshold and output formatting as needed.
# PIL reports size as (width, height), while target_sizes expects (height, width).
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(outputs, target_sizes=target_sizes, threshold=0.5)
for res in results:
    for score, label, box in zip(res["scores"], res["labels"], res["boxes"]):
        print(f"Label {label}: {score:.2f}, Box: {box.tolist()}")
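To turn the detections above into actual subfigure crops, the float boxes need to be clamped to the image bounds and rounded to integer pixels. The following is a minimal sketch of that step, assuming the boxes are in absolute (xmin, ymin, xmax, ymax) pixel coordinates, as produced by the post-processing call. The helper name `to_crop_box` and the sample boxes are hypothetical.

```python
import math

def to_crop_box(box, width, height):
    """Clamp an (xmin, ymin, xmax, ymax) box to the image and round outward.

    Returns a (left, upper, right, lower) tuple of ints, or None if the
    clamped box has no area.
    """
    xmin, ymin, xmax, ymax = box
    left = max(0, math.floor(xmin))
    upper = max(0, math.floor(ymin))
    right = min(width, math.ceil(xmax))
    lower = min(height, math.ceil(ymax))
    if right <= left or lower <= upper:
        return None
    return (left, upper, right, lower)

# Hypothetical detections for a 512 x 200 compound figure.
width, height = 512, 200
boxes = [
    [-3.2, 10.7, 250.4, 199.9],   # slightly outside the left edge
    [260.0, 0.0, 515.5, 200.0],   # slightly outside the right edge
    [600.0, 10.0, 700.0, 50.0],   # entirely outside the image
]

crop_boxes = [to_crop_box(b, width, height) for b in boxes]
valid_crops = [c for c in crop_boxes if c is not None]
print(valid_crops)
```

Each returned tuple follows the (left, upper, right, lower) order that PIL's `Image.crop` accepts, so calling `image.crop(crop_box)` on the loaded figure with `box.tolist()` as input would yield one subfigure per detection.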
Citation
If you find this model or code useful, please consider citing:
@article{baghbanzadeh2025openpmc18m,
title = {Open-PMC-18M: A High-Fidelity Large Scale Medical Dataset for Multimodal Representation Learning},
author = {Baghbanzadeh, Negin and Ashkezari, Sajad and Dolatabadi, Elham and Afkanpour, Arash},
journal = {arXiv preprint arXiv:2506.02738},
year = {2025}
}