Update README.md

README.md (CHANGED)
@@ -52,16 +52,14 @@ This model is a Vision Transformer adapted for neuropathology tasks, developed u

## Model Details

* **Model Type:** Vision Transformer (ViT) for neuropathology.
* **Developed by:**
* **Model Date:**
* **Base Model Architecture:**
* **Input:** Image (
* **Patch Size:** [PLACEHOLDER: e.g., 14 or 16. Confirm based on your model, e.g., "14 for a ViT with patch size 14."]
* **Image Size Compatibility:**
  * The model was trained on images/patches of size
  * For an input of [PLACEHOLDER: e.g., 224x224] with a patch size of [PLACEHOLDER: e.g., 14], this results in 1 class token + ([PLACEHOLDER: e.g., 224]/[PLACEHOLDER: e.g., 14])^2 = [PLACEHOLDER: e.g., 256] patch tokens [Optional: + X register tokens].
  * The model can accept larger images provided the image dimensions are multiples of the patch size. If not, cropping to the closest smaller multiple may occur.
* **License:** [PLACEHOLDER: Reiterate license chosen in YAML, e.g., Apache 2.0. Add link to full license if custom or 'other'.]
* **Repository:** [PLACEHOLDER: Link to your model repository (e.g., GitHub, Hugging Face Hub)]
@@ -92,101 +90,134 @@ This model is intended for research purposes in the field of neuropathology.

## How to Get Started with the Model

[PLACEHOLDER: Provide code snippets for loading and using your model. If available on Hugging Face, show an example using `transformers` or `torch.hub.load`.]

[Removed placeholder snippet: it imported `AutoImageProcessor`/`AutoModel` from `transformers`, fell back to a dummy processor and a generated sample image when loading failed, and ran a single forward pass under `torch.no_grad()`. It is replaced by the working example below.]
## Model Details

* **Model Type:** Vision Transformer (ViT) for neuropathology.
* **Developed by:** Center for Applied Artificial Intelligence
* **Model Date:** 05/05/2025
* **Base Model Architecture:** DINOv2-Giant (ViT-G/14)
* **Input:** Image (224x224).
* **Embedding Dimension:** 1536
* **Patch Size:** 14
* **Image Size Compatibility:**
  * The model was trained on images/patches of size 224x224.
  * The model can accept larger images provided the image dimensions are multiples of the patch size; if they are not, the input may be cropped to the closest smaller multiple (see the sketch below).
* **License:** [PLACEHOLDER: Reiterate license chosen in YAML, e.g., Apache 2.0. Add link to full license if custom or 'other'.]
* **Repository:** [PLACEHOLDER: Link to your model repository (e.g., GitHub, Hugging Face Hub)]
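As a quick illustration of the patch-grid constraint above, here is a minimal sketch (not part of the original card; `crop_to_patch_grid` is an illustrative helper) that computes the cropped size and resulting token count for an arbitrary input, using the 14-pixel patch size and the 224x224 training resolution listed above:

```python
# Illustrative only: how input dimensions map onto the 14-pixel patch grid.
PATCH_SIZE = 14

def crop_to_patch_grid(width, height, patch_size=PATCH_SIZE):
    """Round dimensions down to the closest smaller multiple of the patch size."""
    return (width // patch_size) * patch_size, (height // patch_size) * patch_size

# A 500x375 image would be cropped to 490x364 ...
w, h = crop_to_patch_grid(500, 375)                    # (490, 364)
patch_tokens = (w // PATCH_SIZE) * (h // PATCH_SIZE)   # 35 * 26 = 910 patch tokens
total_tokens = 1 + patch_tokens                        # plus 1 class token

# At the 224x224 training size: (224 // 14) ** 2 = 256 patch tokens + 1 class token.
print((w, h), patch_tokens, total_tokens)
```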
## How to Get Started with the Model

This model can extract embeddings from pathology images using three different approaches: with an image processor for standardized preprocessing, without explicit resizing to preserve the original image dimensions, or with forced 224×224 resizing for consistent inputs. All three apply the same normalization, so researchers can choose the approach that best fits their data characteristics and requirements.
```python
import torch
from PIL import Image
from transformers import AutoModel, AutoImageProcessor
from torchvision import transforms


def get_embeddings_with_processor(image_path, model_path, processor_path):
    """
    Extract embeddings using a HuggingFace image processor.
    This approach handles normalization and resizing automatically.

    Args:
        image_path: Path to the image file
        model_path: Path to the model directory
        processor_path: Path to the processor config directory

    Returns:
        Image embeddings from the model
    """
    # Load model
    model = AutoModel.from_pretrained(model_path)
    model.eval()

    # Load processor from config
    image_processor = AutoImageProcessor.from_pretrained(processor_path)

    # Process the image
    with torch.no_grad():
        image = Image.open(image_path).convert('RGB')
        inputs = image_processor(images=image, return_tensors="pt")
        outputs = model(**inputs)
        embeddings = outputs.last_hidden_state[:, 0, :]

    return embeddings


def get_embeddings_direct(image_path, model_path, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]):
    """
    Extract embeddings directly without an image processor.
    This approach works with various image resolutions since vision transformers
    handle different input sizes by design (dimensions should be multiples of
    the 14-pixel patch size).

    Args:
        image_path: Path to the image file
        model_path: Path to the model directory
        mean: Normalization mean values
        std: Normalization standard deviation values

    Returns:
        Image embeddings from the model
    """
    # Load model
    model = AutoModel.from_pretrained(model_path)
    model.eval()

    # Define transformation - just converting to tensor and normalizing
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std)
    ])

    # Process the image
    with torch.no_grad():
        # Open image and convert to RGB
        image = Image.open(image_path).convert('RGB')
        # Convert image to tensor
        image_tensor = transform(image).unsqueeze(0)  # Add batch dimension
        # Feed to model
        outputs = model(pixel_values=image_tensor)
        # Get embeddings
        embeddings = outputs.last_hidden_state[:, 0, :]

    return embeddings


def get_embeddings_resized(image_path, model_path, size=(224, 224), mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]):
    """
    Extract embeddings with explicit resizing to 224x224.
    This approach ensures consistent input size regardless of original image dimensions.

    Args:
        image_path: Path to the image file
        model_path: Path to the model directory
        size: Target size for resizing (default: 224x224)
        mean: Normalization mean values
        std: Normalization standard deviation values

    Returns:
        Image embeddings from the model
    """
    # Load model
    model = AutoModel.from_pretrained(model_path)
    model.eval()

    # Define transformation with explicit resize
    transform = transforms.Compose([
        transforms.Resize(size, interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std)
    ])

    # Process the image
    with torch.no_grad():
        image = Image.open(image_path).convert('RGB')
        image_tensor = transform(image).unsqueeze(0)  # Add batch dimension
        outputs = model(pixel_values=image_tensor)
        embeddings = outputs.last_hidden_state[:, 0, :]

    return embeddings


# Example usage
if __name__ == "__main__":
    image_path = "test.jpg"
    model_path = "outputs/training_test_3/teacher_checkpoints/iter_40"
    processor_path = "processor_config.json"  # Processor config file (or a directory containing preprocessor_config.json)

    # Method 1: Using image processor (recommended for consistency)
    embeddings1 = get_embeddings_with_processor(image_path, model_path, processor_path)
    print('Embedding shape (with processor):', embeddings1.shape)

    # Method 2: Direct approach without resizing (works with various resolutions)
    embeddings2 = get_embeddings_direct(image_path, model_path)
    print('Embedding shape (direct):', embeddings2.shape)

    # Method 3: With explicit resize to 224x224
    embeddings3 = get_embeddings_resized(image_path, model_path)
    print('Embedding shape (resized):', embeddings3.shape)
```
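For whole-slide workflows it is common to embed many patches at once. Below is a minimal batching sketch under the same assumptions as the functions above (same checkpoint path, normalization, and 224×224 resize); `get_embeddings_batch` and the example file names are illustrative, not part of the released code.

```python
import torch
from PIL import Image
from transformers import AutoModel
from torchvision import transforms

def get_embeddings_batch(image_paths, model_path, size=(224, 224),
                         mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]):
    """Illustrative sketch: embed a list of patches in a single forward pass."""
    model = AutoModel.from_pretrained(model_path)
    model.eval()
    transform = transforms.Compose([
        transforms.Resize(size, interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std),
    ])
    with torch.no_grad():
        # Stack the preprocessed patches into one (N, 3, H, W) batch
        batch = torch.stack([transform(Image.open(p).convert("RGB")) for p in image_paths])
        outputs = model(pixel_values=batch)
        # One class-token embedding per patch; shape (N, 1536) for this model
        return outputs.last_hidden_state[:, 0, :]

# Example (hypothetical file names):
# embeddings = get_embeddings_batch(["patch_001.png", "patch_002.png"], model_path)
```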
## Training Data