AIorNot-SigLIP2 is a vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for binary image classification. It is trained to detect whether an image is generated by AI or is a real photograph using the SiglipForImageClassification architecture.
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features https://arxiv.org/pdf/2502.14786
Classification Report:

              precision    recall  f1-score   support

        Real     0.9215    0.8842    0.9025      8288
          AI     0.9100    0.9396    0.9246     10330

    accuracy                         0.9149     18618
   macro avg     0.9158    0.9119    0.9135     18618
weighted avg     0.9151    0.9149    0.9147     18618
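As a sanity check on the report, the following minimal sketch recomputes the F1 scores and the macro and weighted averages from the per-class precision, recall, and support values. The helper name `harmonic_f1` and the `report` dictionary are illustrative and not part of the model's tooling. Because the inputs are already rounded to four decimals, the results should be expected to match the report only approximately.

```python
# Per-class figures copied from the classification report.
report = {
    "Real": {"precision": 0.9215, "recall": 0.8842, "support": 8288},
    "AI": {"precision": 0.9100, "recall": 0.9396, "support": 10330},
}


def harmonic_f1(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(row["support"] for row in report.values())

# Per-class F1 scores.
f1 = {
    name: harmonic_f1(row["precision"], row["recall"])
    for name, row in report.items()
}

# Macro average: every class counts equally.
macro_f1 = sum(f1.values()) / len(f1)

# Weighted average: every class counts in proportion to its support.
weighted_f1 = sum(
    f1[name] * row["support"] for name, row in report.items()
) / total_support
weighted_recall = sum(
    row["recall"] * row["support"] for row in report.values()
) / total_support

for name, value in f1.items():
    print(f"{name}: F1 = {value:.4f}")
print(f"macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
print(f"weighted recall = {weighted_recall:.4f}")
```

The support-weighted recall is the fraction of all images classified correctly, which is why it coincides with the accuracy row of the report.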
The model classifies an image as either:
Class 0: Real
Class 1: AI
pip install -q transformers torch pillow gradio hf_xet
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/AIorNot-SigLIP2"  # Replace with your model path
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# Label mapping
id2label = {
    "0": "Real",
    "1": "AI"
}

def classify_image(image):
    # Gradio passes a NumPy array; convert it to an RGB PIL image
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        # Softmax over the class dimension gives one probability per label
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }
    return prediction

# Gradio Interface
iface = gr.Interface(
    fn=classify_image,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=2, label="AI or Real Detection"),
    title="AIorNot-SigLIP2",
    description="Upload an image to classify whether it is AI-generated or Real."
)

if __name__ == "__main__":
    iface.launch()
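The post-processing inside classify_image (softmax over the two logits, then mapping class indices to labels) can be illustrated without loading the model. The sketch below uses only the standard library. The names `softmax`, `to_prediction`, and `ID2LABEL` are illustrative, and the logit values are hypothetical, not real model outputs.

```python
import math

# Class indices as listed in the model card: 0 = Real, 1 = AI.
ID2LABEL = {0: "Real", 1: "AI"}


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(logits):
    """Map each class index to its label and rounded probability."""
    probs = softmax(logits)
    return {ID2LABEL[i]: round(p, 3) for i, p in enumerate(probs)}


# Hypothetical logits for one image, in the order [Real, AI].
example_logits = [-1.2, 2.3]
prediction = to_prediction(example_logits)
top_label = max(prediction, key=prediction.get)

print(prediction)
print("Predicted class:", top_label)
```

The dictionary returned by `to_prediction` has the same shape as the one classify_image passes to gr.Label, which displays the labels ranked by probability.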
AIorNot-SigLIP2 is useful in scenarios where an image must be identified as AI-generated or real.
Base model: google/siglip2-base-patch16-224