Food-or-Not-SigLIP2

Food-or-Not-SigLIP2 is an image classification model fine-tuned from google/siglip2-base-patch16-224, a vision-language encoder, for binary classification. It uses the SiglipForImageClassification architecture and is trained to distinguish images of food from images of non-food objects.

Classification Report:
              precision    recall  f1-score   support

        food     0.8902    0.8610    0.8753      4000
    not-food     0.8654    0.8938    0.8794      4000

    accuracy                         0.8774      8000
   macro avg     0.8778    0.8774    0.8773      8000
weighted avg     0.8778    0.8774    0.8773      8000
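
As a quick sanity check on the report above, the following sketch recomputes the per-class F1 scores and the overall accuracy from the precision, recall, and support values in the table. It uses only those numbers; the helper name `f1_score` is illustrative, and small differences in the fourth decimal are expected because the inputs are already rounded.

```python
# Per-class metrics copied from the classification report above
report = {
    "food":     {"precision": 0.8902, "recall": 0.8610, "support": 4000},
    "not-food": {"precision": 0.8654, "recall": 0.8938, "support": 4000},
}

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# F1 per class, recomputed from the rounded precision and recall
f1 = {label: f1_score(m["precision"], m["recall"]) for label, m in report.items()}

# Accuracy is the support-weighted mean of the per-class recalls:
# recall * support gives the number of correctly classified images per class
total = sum(m["support"] for m in report.values())
correct = sum(m["recall"] * m["support"] for m in report.values())
accuracy = correct / total

for label, value in f1.items():
    print(f"{label}: F1 ~ {value:.3f}")
print(f"accuracy ~ {accuracy:.4f}")
```

Because both classes have the same support, the accuracy is simply the average of the two recalls.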


Label Space: 2 Classes

The model classifies each image into one of the following categories:

Class 0: "food"
Class 1: "not-food"
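
To illustrate how the two class indices map to labels, here is a minimal sketch in plain Python that applies a softmax to a pair of logits and picks the higher-probability class. The logits are made-up values for illustration, not model output, and `label_from_logits` is a hypothetical helper name.

```python
import math

# Index-to-label mapping as listed above
ID2LABEL = {0: "food", 1: "not-food"}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_from_logits(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Made-up logits for illustration only: class 0 scores higher than class 1
label, prob = label_from_logits([2.0, -1.0])
print(label, round(prob, 3))
```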

Install Dependencies

pip install -q transformers torch pillow gradio

Inference Code

import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/Food-or-Not-SigLIP2"  # Replace with your model path if different
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# Label mapping
id2label = {
    "0": "food",
    "1": "not-food"
}

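def classify_food(image):
    """Return a {label: probability} dict for a NumPy image passed in by Gradio."""
    # Gradio passes a NumPy array; convert it to a PIL RGB image
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Run inference without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        # Softmax over the class dimension turns logits into probabilities
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    # Map each class index to its label and rounded probability
    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }

    return prediction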

# Gradio Interface
iface = gr.Interface(
    fn=classify_food,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=2, label="Food Classification"),
    title="Food-or-Not-SigLIP2",
    description="Upload an image to detect if it contains food or not."
)

if __name__ == "__main__":
    iface.launch()
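
Outside of Gradio, the dictionary returned by `classify_food` can be reduced to a single decision. The sketch below assumes that dictionary shape (a probability for `"food"` and one for `"not-food"`); the helper `decide` and its 0.5 default threshold are illustrative choices, not part of the model.

```python
def decide(prediction: dict, threshold: float = 0.5) -> str:
    """Return "food" if its probability reaches the threshold, else "not-food".

    `prediction` is expected to look like the output of classify_food,
    e.g. {"food": 0.91, "not-food": 0.09}.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return "food" if prediction["food"] >= threshold else "not-food"

# Hypothetical prediction values, for illustration only
example = {"food": 0.62, "not-food": 0.38}
print(decide(example))                 # default threshold
print(decide(example, threshold=0.8))  # stricter threshold
```

Raising the threshold typically trades recall on the food class for precision, which may suit applications where false positives are costly.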

Intended Use

Food-or-Not-SigLIP2 can be used for:

  • Dietary Apps: Automatically classify images for food detection.
  • Retail & E-commerce: Filter food vs non-food products visually.
  • Content Moderation: Flag content containing food items.
  • Dataset Curation: Separate food-related images for training or filtering.
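
For the dataset curation use case, one possible approach is to group image paths by predicted label. The sketch below accepts any callable that maps a path to a label-to-probability dictionary; `partition_by_label` and the stub classifier are hypothetical and stand in for a wrapper around the model.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

def partition_by_label(
    paths: Iterable[str],
    classify: Callable[[str], Dict[str, float]],
) -> Dict[str, List[str]]:
    """Group paths under the label with the highest predicted probability."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        scores = classify(path)
        best_label = max(scores, key=scores.get)
        groups[best_label].append(path)
    return dict(groups)

# Stub classifier for illustration only; a real one would open the image
# and run it through the model
def stub_classify(path: str) -> Dict[str, float]:
    is_food = "pizza" in path
    return {"food": 0.9, "not-food": 0.1} if is_food else {"food": 0.2, "not-food": 0.8}

groups = partition_by_label(["pizza_01.jpg", "car_07.jpg"], stub_classify)
print(groups)
```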