prithivMLmods committed on
Commit
dcf567a
·
verified ·
1 Parent(s): 2f80193

Update README.md

  1. README.md +107 -0
README.md CHANGED
@@ -2,8 +2,31 @@
  license: apache-2.0
  datasets:
  - prithivMLmods/Shoe-Net-10K
  ---
 
  ```py
  Classification Report:
  precision recall f1-score support
@@ -20,3 +43,87 @@ weighted avg 0.9202 0.9197 0.9194 10000
  ```
 
  ![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/e5c_wP09atj7GhXoxUnHW.png)
  license: apache-2.0
  datasets:
  - prithivMLmods/Shoe-Net-10K
+ language:
+ - en
+ base_model:
+ - google/siglip2-base-patch16-512
+ pipeline_tag: image-classification
+ library_name: transformers
+ tags:
+ - SigLIP2
+ - Ballet Flat
+ - Boat
+ - Sneaker
+ - Clog
+ - Brogue
  ---
 
+ ![44.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/_6WAhmO9_W74Sz2AhwytE.png)
+
+ # shoe-type-detection
+
+ > shoe-type-detection is a vision-language encoder model fine-tuned from `google/siglip2-base-patch16-512` for **multi-class image classification**. It is trained to detect different types of shoes such as **Ballet Flats**, **Boat Shoes**, **Brogues**, **Clogs**, and **Sneakers**. The model uses the `SiglipForImageClassification` architecture.
+
+ > [!note]
+ > SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
+ > [https://arxiv.org/pdf/2502.14786](https://arxiv.org/pdf/2502.14786)
+
  ```py
  Classification Report:
  precision recall f1-score support
 
  ```
 
  ![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/e5c_wP09atj7GhXoxUnHW.png)
+
+ ---
+
+ ## Label Space: 5 Classes
+
+ ```
+ Class 0: Ballet Flat
+ Class 1: Boat
+ Class 2: Brogue
+ Class 3: Clog
+ Class 4: Sneaker
+ ```
+
+ ---
+
+ ## Install Dependencies
+
+ ```bash
+ pip install -q transformers torch pillow gradio hf_xet
+ ```
+
+ ---
+
+ ## Inference Code
+
+ ```python
+ import gradio as gr
+ from transformers import AutoImageProcessor, SiglipForImageClassification
+ from PIL import Image
+ import torch
+
+ # Load model and processor
+ model_name = "prithivMLmods/shoe-type-detection" # Update with actual model name on Hugging Face
+ model = SiglipForImageClassification.from_pretrained(model_name)
+ processor = AutoImageProcessor.from_pretrained(model_name)
+
+ # Updated label mapping
+ id2label = {
+     "0": "Ballet Flat",
+     "1": "Boat",
+     "2": "Brogue",
+     "3": "Clog",
+     "4": "Sneaker"
+ }
+
+ def classify_image(image):
+     image = Image.fromarray(image).convert("RGB")
+     inputs = processor(images=image, return_tensors="pt")
+
+     with torch.no_grad():
+         outputs = model(**inputs)
+         logits = outputs.logits
+         probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
+
+     prediction = {
+         id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
+     }
+
+     return prediction
+
+ # Gradio Interface
+ iface = gr.Interface(
+     fn=classify_image,
+     inputs=gr.Image(type="numpy"),
+     outputs=gr.Label(num_top_classes=5, label="Shoe Type Classification"),
+     title="Shoe Type Detection",
+     description="Upload an image of a shoe to classify it as Ballet Flat, Boat, Brogue, Clog, or Sneaker."
+ )
+
+ if __name__ == "__main__":
+     iface.launch()
+ ```
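
Editor's note (not part of the commit): the post-processing inside `classify_image` can be checked without downloading the model. The sketch below is a rough pure-Python equivalent of the softmax and label-mapping steps. The `softmax` and `logits_to_prediction` helpers and the example logits are hypothetical and used only for illustration; the label mapping is copied from the README.

```python
import math

# Label mapping copied from the README's id2label (integer keys here for brevity).
ID2LABEL = {0: "Ballet Flat", 1: "Boat", 2: "Brogue", 3: "Clog", 4: "Sneaker"}


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_prediction(logits):
    """Map raw logits to {label: probability}, rounded to 3 places as in classify_image."""
    probs = softmax(logits)
    return {ID2LABEL[i]: round(p, 3) for i, p in enumerate(probs)}


# Hypothetical logits, for illustration only (not real model output).
example = logits_to_prediction([0.1, 0.2, 0.3, 0.4, 2.5])
print(max(example, key=example.get))  # Sneaker
```

The dictionary has the same shape as the one `classify_image` returns, which is what `gr.Label` consumes.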
+
+ ---
+
+ ## Intended Use
+
+ `shoe-type-detection` is designed for:
+
+ * **E-Commerce Automation** – Automate product tagging and classification in online retail platforms.
+ * **Footwear Inventory Management** – Efficiently organize and categorize large volumes of shoe images.
+ * **Retail Intelligence** – Enable AI-powered search and filtering based on shoe types.
+ * **Smart Surveillance** – Identify and analyze footwear types in surveillance footage for retail analytics.
+ * **Fashion and Apparel Research** – Analyze trends in shoe types and customer preferences using image datasets.
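
Editor's note (not part of the commit): for the product-tagging use case listed above, one possible approach is to accept the top label only when its score clears a confidence threshold and to send everything else to manual review. The sketch below assumes a prediction dictionary shaped like the output of `classify_image`. The `tag_product` helper, the `min_confidence` default of 0.8, and the example scores are all hypothetical and not taken from the README.

```python
def tag_product(prediction, min_confidence=0.8):
    """Return the top label, or None when the best score is below min_confidence."""
    label, score = max(prediction.items(), key=lambda item: item[1])
    return label if score >= min_confidence else None


# Hypothetical scores for illustration; not real model output.
confident = {"Ballet Flat": 0.01, "Boat": 0.02, "Brogue": 0.03, "Clog": 0.02, "Sneaker": 0.92}
ambiguous = {"Ballet Flat": 0.05, "Boat": 0.40, "Brogue": 0.45, "Clog": 0.05, "Sneaker": 0.05}

print(tag_product(confident))  # Sneaker
print(tag_product(ambiguous))  # None
```

A suitable threshold depends on the catalogue and on how costly a wrong tag is, so it is best tuned on held-out images rather than fixed in advance.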