---
language:
- en
tags:
- vision-transformer
- dinov2
- neuropathology
- image-classification # Or image-segmentation, object-detection, etc.
- university-of-kentucky
# - add-other-relevant-tags-here
license: "apache-2.0"
datasets:
- "uky-neuropathology-placeholder" # IMPORTANT: Replace with an actual dataset identifier if available on the Hub, or a descriptive name for your dataset. Cannot be empty.
# pipeline_tag: "image-classification" # Uncomment and set if applicable (e.g., image-classification, image-segmentation)
base_model: "facebook/dinov2-giant"
# metrics: # Uncomment and fill if you have structured evaluation results
# - accuracy
# - f1
# - roc_auc
# model-index: # For detailed, structured evaluation results (see Hugging Face docs)
# - name: "[Your Model Name]"
#   results:
#   - task:
#       type: "image-classification"
#     dataset:
#       name: "UKy Neuropathology Test Set Placeholder"
#       type: "private" # e.g., private-institutional-dataset, or a Hub dataset identifier
#     metrics:
#     - name: "Accuracy"
#       type: "accuracy"
#       value: 0.0 # e.g., 0.925
#     - name: "F1-score"
#       type: "f1"
#       value: 0.0 # e.g., 0.924
#     source:
#       name: "Internal Evaluation Report Placeholder" # e.g., Internal Evaluation Report or Link to Paper
#       url: "" # Link if available
co2_eq_emissions: # Standard Hub metadata field for carbon emissions
  emissions: 1.0 # IMPORTANT: Replace with your estimated emissions in grams of CO2eq. This is a placeholder value.
  source: "Estimated" # IMPORTANT: Replace with how you got this value (e.g., "ML CO2 Impact tool", "CodeCarbon", "Estimated")
  # training_type: "fine-tuning" # Optional: e.g., pretraining, fine-tuning
  # geographical_location: "Lexington, KY, USA" # Optional
  # hardware_used: "NVIDIA A100" # Optional
# thumbnail: "url-to-your-thumbnail-image.jpg" # Optional: URL to a thumbnail image for the model card
---

# Model Card for Neuropathology Vision Transformer

This model is a Vision Transformer adapted for neuropathology tasks, developed using data from the University of Kentucky. It leverages principles from self-supervised learning models like DINOv2.

## Model Details

* **Model Type:** Vision Transformer (ViT) for neuropathology.
* **Developed by:** Center for Applied Artificial Intelligence (CAAI)
* **Model Date:** 05/2025
* **Base Model Architecture:** DINOv2-giant (https://huggingface.co/facebook/dinov2-giant)
* **Input:** Image (224x224).
* **Output:** Class token and patch tokens. These can be used for various downstream tasks (e.g., classification, segmentation, similarity search).
* **Embedding Dimension:** 1536
* **Patch Size:** 14
* **Image Size Compatibility:**
    * The model was trained on images/patches of size 224x224.
    * The model can also accept inputs of other sizes, not just the 224x224 dimensions used in training.
* **License:** Apache 2.0

## Intended Uses

This model is intended for research purposes in the field of neuropathology.

* **Primary Intended Uses:**
    * Classification of tissue samples based on the presence/severity of neuropathological changes.
    * Feature extraction for quantitative analysis of neuropathology.
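To make the input/output description above concrete, here is a minimal sketch of what the model returns. It assumes the `IBI-CAAI/NP-TEST-0` identifier used in the examples later in this card and a hypothetical `tile.jpg` input; for a 224x224 image with patch size 14, the output contains one class token plus 16 x 16 = 256 patch tokens, each of dimension 1536.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

model_path = "IBI-CAAI/NP-TEST-0"  # identifier used in the examples below
model = AutoModel.from_pretrained(model_path)
processor = AutoImageProcessor.from_pretrained(model_path)
model.eval()

image = Image.open("tile.jpg").convert("RGB")  # hypothetical tissue tile
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

tokens = outputs.last_hidden_state      # (1, 257, 1536): class token + 256 patch tokens for a 224x224 input
class_token = tokens[:, 0, :]           # (1, 1536) global image embedding
patch_tokens = tokens[:, 1:, :]         # (1, 256, 1536) one embedding per 14x14 patch
```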
## How to Get Started with the Model

Below are three example methods for extracting embeddings with Hugging Face `transformers` (adjust based on your actual model and task):

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoImageProcessor
from torchvision import transforms


def get_embeddings_with_processor(image_path, model_path):
    """
    Extract embeddings using a HuggingFace image processor.
    This approach handles normalization and resizing automatically.

    Args:
        image_path: Path to the image file
        model_path: Path to the model directory

    Returns:
        Image embeddings from the model
    """
    # Load model
    model = AutoModel.from_pretrained(model_path)
    model.eval()

    # Load processor from config
    image_processor = AutoImageProcessor.from_pretrained(model_path)

    # Process the image
    with torch.no_grad():
        image = Image.open(image_path).convert('RGB')
        inputs = image_processor(images=image, return_tensors="pt")
        outputs = model(**inputs)
        embeddings = outputs.last_hidden_state[:, 0, :]

    return embeddings


def get_embeddings_direct(image_path, model_path,
                          mean=[0.83800817, 0.6516568, 0.78056043],
                          std=[0.08324149, 0.09973671, 0.07153901]):
    """
    Extract embeddings directly without an image processor.
    This approach works with various image resolutions since
    transformers handle different input sizes by design.

    Args:
        image_path: Path to the image file
        model_path: Path to the model directory
        mean: Normalization mean values
        std: Normalization standard deviation values

    Returns:
        Image embeddings from the model
    """
    # Load model
    model = AutoModel.from_pretrained(model_path)
    model.eval()

    # Define transformation - just converting to tensor and normalizing
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std)
    ])

    # Process the image
    with torch.no_grad():
        # Open image and convert to RGB
        image = Image.open(image_path).convert('RGB')

        # Convert image to tensor and add batch dimension
        image_tensor = transform(image).unsqueeze(0)

        # Feed to model
        outputs = model(pixel_values=image_tensor)

        # Get embeddings
        embeddings = outputs.last_hidden_state[:, 0, :]

    return embeddings


def get_embeddings_resized(image_path, model_path, size=(224, 224),
                           mean=[0.485, 0.456, 0.406],
                           std=[0.229, 0.224, 0.225]):
    """
    Extract embeddings with explicit resizing to 224x224.
    This approach ensures consistent input size regardless of original image dimensions.

    Args:
        image_path: Path to the image file
        model_path: Path to the model directory
        size: Target size for resizing (default: 224x224)
        mean: Normalization mean values
        std: Normalization standard deviation values

    Returns:
        Image embeddings from the model
    """
    # Load model
    model = AutoModel.from_pretrained(model_path)
    model.eval()

    # Define transformation with explicit resize
    transform = transforms.Compose([
        transforms.Resize(size, interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std)
    ])

    # Process the image
    with torch.no_grad():
        image = Image.open(image_path).convert('RGB')
        image_tensor = transform(image).unsqueeze(0)  # Add batch dimension
        outputs = model(pixel_values=image_tensor)
        embeddings = outputs.last_hidden_state[:, 0, :]

    return embeddings


# Example usage
if __name__ == "__main__":
    image_path = "test.jpg"
    model_path = "IBI-CAAI/NP-TEST-0"

    # Method 1: Using image processor (recommended for consistency)
    embeddings1 = get_embeddings_with_processor(image_path, model_path)
    print('Embedding shape (with processor):', embeddings1.shape)

    # Method 2: Direct approach without resizing (works with various resolutions)
    embeddings2 = get_embeddings_direct(image_path, model_path)
    print('Embedding shape (direct):', embeddings2.shape)

    # Method 3: With explicit resize to 224x224
    embeddings3 = get_embeddings_resized(image_path, model_path)
    print('Embedding shape (resized):', embeddings3.shape)
```
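Since the Model Details list similarity search as a downstream use, here is a short, hypothetical follow-up that reuses `get_embeddings_with_processor` from the block above to compare two tiles by cosine similarity. The tile filenames are placeholders, and the snippet assumes the helper function is already defined.

```python
import torch.nn.functional as F

# Hypothetical tile paths; reuses the helper defined in the block above.
emb_a = get_embeddings_with_processor("tile_a.jpg", "IBI-CAAI/NP-TEST-0")
emb_b = get_embeddings_with_processor("tile_b.jpg", "IBI-CAAI/NP-TEST-0")

# Cosine similarity between the two class-token embeddings (1.0 = identical direction).
similarity = F.cosine_similarity(emb_a, emb_b).item()
print(f"Cosine similarity between tiles: {similarity:.3f}")
```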
## Training Data

* **Dataset(s):** The model was trained on data from the University of Kentucky.
    * **Name/Identifier:** UK Alzheimer's Disease Center Neuropathology Whole Slide Image Cohort [BDSA TEST v1.0]
    * **Source:** [UK-ADRC Neuropathology Lab at the University of Kentucky](https://neuropathlab.createuky.net/), [PLACEHOLDER: Specific Department, Center, or PI, e.g., Sanders-Brown Center on Aging, Department of Pathology]
    * **Description:** [PLACEHOLDER: Describe the data. E.g., "Digitized whole slide images (WSIs) of human post-mortem brain tissue sections from [number] subjects. Sections were stained with [e.g., Hematoxylin and Eosin (H&E), and immunohistochemistry for Amyloid-beta (Aβ) and phosphorylated Tau (pTau)]. Images were acquired using [e.g., Aperio AT2 scanner at 20x magnification]."]
    * **Preprocessing:** [PLACEHOLDER: Describe significant preprocessing steps. E.g., "WSIs were tiled into non-overlapping [e.g., 224x224 pixel] patches. Tiles with excessive background or artifacts were excluded. Color normalization using [Method, e.g., Macenko method] was applied."]
    * **Annotation (if applicable for supervised fine-tuning or evaluation):** [PLACEHOLDER: Describe the annotation process. E.g., "Regions of interest (ROIs) for [pathologies] were annotated by board-certified neuropathologists. For classification tasks, slide-level or region-level labels for [disease/pathology presence/severity] were provided."]

## Training Procedure

* **Training System/Framework:** DINO-MX (Modular & Flexible Self-Supervised Training Framework)
* **Base Model (if fine-tuning):** Pretrained `facebook/dinov2-giant` loaded from the Hugging Face Hub.
* **Training Objective(s):** Self-supervised learning using the DINO loss and the iBOT masked-image-modeling loss.
* **Key Hyperparameters (example):**
    * Batch size: 32
    * Learning rate: 1.0e-4
    * Epochs/Iterations: 5000 iterations
    * Optimizer: AdamW
    * Weight decay: 0.04-0.4

## Evaluation

* **Task(s):** Classification, KNN, Clustering, Robustness
* **Metrics:** Accuracy, Precision, Recall, F1
* **Dataset(s):** Neuro Path dataset
* **Results:** The model achieved strong performance across multiple evaluation methods using the Neuro Path dataset. The model architecture is based on facebook/dinov2-giant.

**Linear Probe Performance:**
- Accuracy: 80.17%
- Precision: 79.20%
- Recall: 79.60%
- F1 Score: 77.88%

**K-Nearest Neighbors Classification:**
- Accuracy: 83.76%
- Precision: 83.34%
- Recall: 83.76%
- F1 Score: 83.40%

**Clustering Quality:**
- Silhouette Score: 0.267
- Adjusted Mutual Information: 0.473

**Robustness Score:** 0.574

**Overall Performance Score:** 0.646

### Model Evaluation Radar Chart

The radar chart provides a visual comparison of multiple models across several performance metrics. Each axis extending from the center represents a different metric. The farther a model's line is from the center along a particular axis, the better its score for that specific metric (assuming higher is better for the metric).

**How to Interpret:**

* **Axes:** Each spoke of the radar represents a distinct evaluation metric.
* **Lines/Polygons:** Each colored line (forming a polygon) represents a different model.
* **Performance:** A point on an axis closer to the outer edge indicates a higher score for that metric.
* **Overall Comparison:** By comparing the shapes and sizes of the polygons, you can get a quick visual understanding of the strengths and weaknesses of each model relative to others. A larger overall polygon generally suggests better all-around performance on the displayed metrics.

---

### Evaluation Tests Displayed

The chart displays results from several standard evaluation tests if their metrics are present in the `evaluation_results.json` files. The script also automatically discovers and plots other custom numeric metrics found within the "components" section of your JSON files.

#### 1. Linear Probe

* **What it is:** This test evaluates the quality of the model's learned features (embeddings). A simple linear classifier is trained on top of these frozen features to perform a classification task.
* **Purpose:** It assesses how well the learned representations can be used for downstream tasks with a minimal amount of additional training. Good performance indicates that the embeddings are linearly separable and capture meaningful information.
* **Common Metrics:** Accuracy, Precision, Recall, F1-Score (calculated for the linear classifier).

#### 2. K-Nearest Neighbors (KNN) Evaluation

* **What it is:** This test also evaluates the quality of the model's embeddings. Instead of training a new classifier, it uses the K-Nearest Neighbors algorithm directly on the embeddings to make predictions. For a given data point, its class is determined by the majority class among its 'k' closest neighbors in the embedding space.
* **Purpose:** It assesses the local structure and similarity relationships within the embedding space. Good KNN performance suggests that similar items are close to each other in the learned representation.
* **Common Metrics:** Accuracy, Precision, Recall, F1-Score (calculated for the KNN classifier). A minimal sketch of both evaluations follows below.
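As a concrete illustration of the two evaluations above, here is a minimal scikit-learn sketch over precomputed embeddings. The `.npy` filenames, the logistic-regression probe, and the choice of `n_neighbors=20` are assumptions for illustration; the actual evaluation pipeline may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, f1_score

# Assumed precomputed class-token embeddings (e.g., stacked outputs of
# get_embeddings_with_processor) and integer class labels.
X_train, y_train = np.load("train_embeddings.npy"), np.load("train_labels.npy")
X_test, y_test = np.load("test_embeddings.npy"), np.load("test_labels.npy")

# Linear probe: a simple linear classifier trained on frozen features.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probe_pred = probe.predict(X_test)

# KNN evaluation: majority vote among the k nearest embeddings (k is arbitrary here).
knn = KNeighborsClassifier(n_neighbors=20).fit(X_train, y_train)
knn_pred = knn.predict(X_test)

print("Linear probe accuracy:", accuracy_score(y_test, probe_pred))
print("Linear probe F1 (macro):", f1_score(y_test, probe_pred, average="macro"))
print("KNN accuracy:", accuracy_score(y_test, knn_pred))
print("KNN F1 (macro):", f1_score(y_test, knn_pred, average="macro"))
```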
#### 3. Clustering

* **What it is:** This set of tests evaluates how well the model's embeddings can naturally group similar items together without predefined labels (unsupervised). Algorithms like K-Means are often used to partition the data points based on their embeddings.
* **Purpose:** It assesses the intrinsic structure and separability of the learned representations into meaningful groups.
* **Common Metrics:**
    * **Silhouette Score:** Measures how similar an object is to its own cluster compared to other clusters. Ranges from -1 to 1 (higher is better).
    * **Adjusted Mutual Information (AMI):** Measures the agreement between true labels (if available) and clustering assignments, adjusted for chance. A value of 1 indicates perfect agreement, while values near 0 indicate chance-level agreement (higher is better).

#### 4. Robustness

* **What it is:** This is a general category of tests designed to measure how well a model maintains its performance when faced with various challenges or changes in the input data.
* **Purpose:** It assesses the model's stability and reliability under non-ideal conditions.
* **Examples of Challenges:** This can include noisy data, adversarial attacks (inputs intentionally designed to fool the model), out-of-distribution samples (data different from what the model was trained on), or other perturbations.
* **Common Metrics:** Often a "Robustness Score" is reported, which could be an accuracy, F1-score, or other relevant metric evaluated on the challenged dataset. The specific calculation depends on the nature of the robustness test (higher is generally better).

---

**Custom Metrics:** The radar chart will also display any other top-level numeric metrics, or metrics nested one level deep within dictionaries, found under the `"components"` key in your `evaluation_results.json` files. The names for these custom metrics on the chart are derived from their keys in the JSON file (e.g., a key `inference_time_ms` would appear as `Inference time ms`).

## Ethical Considerations

* **Data Usage:**
    * [PLACEHOLDER: E.g., "The data from the University of Kentucky used for training and evaluating this model was collected and utilized under Institutional Review Board (IRB) protocol #[XYZ] at the University of Kentucky.", "All data was de-identified prior to its use in this research in accordance with IRB-approved procedures and applicable privacy regulations (e.g., HIPAA)."]
* **Patient Privacy:**
    * [PLACEHOLDER: E.g., "Measures were taken to ensure de-identification of patient data. The model outputs do not contain personally identifiable information."]
* **Intended Use Context:**
    * This model is intended for research purposes to augment the capabilities of neuropathology researchers. It is not a medical device and should not be used for direct clinical decision-making, diagnosis, or treatment planning without comprehensive validation, regulatory approval (if applicable), and oversight by qualified medical professionals.
* **Fairness and Bias Mitigation:**
    * [PLACEHOLDER: Describe any steps taken during development to assess or mitigate bias, or plans for future work in this area. E.g., "Ongoing work includes evaluating model performance across different demographic subgroups represented in the University of Kentucky dataset to identify and address potential disparities."]

## Contact

For any additional questions or comments, contact CAAI (`ai@uky.edu`), Mahmut Gokmen (`m.gokmen@uky.edu`), or Cody Bumgardner (`cody@uky.edu`).

## Citation / BibTeX

TBD