---
license: mit
datasets:
- alfredplpl/Japanese-photos
- 3sara/colpali_italian_documents
pipeline_tag: image-classification
tags:
- image-classification
- mobile
- tablet
- quantization
- onnx
- mobilenetv3
- mobilenet_v3
- mobilenetv3_onnx
- document-classification
- photo-classification
- real-time
- lightweight
- efficient
- document
- photo
- images
- q8
- int8
- edge-ai
- ai-on-device
- offline
- privacy
- fast
- android
- ios
- gallery
---

# MobileNetV3 - ONNX, Quantized

### 🔥 Lightweight mobile model for **image classification** into two categories:
- **`document`** (scans, receipts, papers, invoices)
- **`photo`** (regular phone photos: scenes, people, nature, etc.)

---

## 🟢 Overview

- **Designed for mobile devices** (Android/iOS phones and tablets) and suited to real-time on-device inference.
- Architecture: **MobileNetV3** (small variant)
- Format: **ONNX** (both float32 and quantized int8 versions included)
- Trained on balanced, real-world open-source datasets for both documents and photos.
- Ideal for tasks like:
  - Document detection in gallery/camera rolls
  - Screenshot, receipt, photo, and PDF preview classification
  - Image sorting for privacy-first offline AI assistants

---

## 🏷️ Model Classes
- **0**: `document`
- **1**: `photo`

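
The class indices above can be turned into a label and a confidence value with a small helper. This is a minimal sketch, assuming the model returns one row of two raw scores (logits) per image, which the card does not confirm; if the model already emits probabilities, the argmax is still valid but the softmax step should be dropped. `CLASS_NAMES` and `decode_prediction` are hypothetical names used only for illustration.

```python
from typing import Tuple

import numpy as np

# Index-to-label mapping taken from the class list in this card.
CLASS_NAMES = ["document", "photo"]


def decode_prediction(logits) -> Tuple[str, float]:
    """Return (label, probability) for one row of two raw scores."""
    scores = np.asarray(logits, dtype=np.float64).reshape(-1)
    if scores.size != len(CLASS_NAMES):
        raise ValueError("expected exactly one score per class")
    # Numerically stable softmax: subtract the max before exponentiating.
    exp = np.exp(scores - scores.max())
    probs = exp / exp.sum()
    idx = int(np.argmax(probs))
    return CLASS_NAMES[idx], float(probs[idx])


# Assumed example output row, shaped like a batch of one image.
label, confidence = decode_prediction(np.array([[2.0, -1.0]]))
print(label, round(confidence, 3))  # document 0.953
```
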
---

## ⚡️ Versions

- `mobilenet_v3_small.onnx`: standard float32 for maximum accuracy (best for ARM/CPU)
- `mobilenet_v3_small_quant.onnx`: quantized int8 for faster inference and a smaller file (best for low-power or edge devices)

---

## Why this model?

- **Ultra-small size** (~10-15MB), real-time inference (<100ms) on most phones
- **Runs 100% offline** (privacy, no cloud required)
- **Easy integration** with any ONNX-capable framework, including React Native (`onnxruntime-react-native`), Android (ONNX Runtime), and iOS.

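
Latency depends heavily on the device, so the sub-100 ms figure is worth checking on your own target. The sketch below is one way to do that, not the author's benchmark: `measure_latency_ms` and `run_once` are hypothetical names, and `run_once` stands for any zero-argument callable that performs a single inference.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency_ms(
    run_once: Callable[[], object], warmup: int = 5, runs: int = 50
) -> Dict[str, float]:
    """Time `run_once` and return median and worst-case latency in ms."""
    for _ in range(warmup):
        run_once()  # warm-up calls are executed but not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {"median_ms": statistics.median(samples), "max_ms": max(samples)}


# Stand-in workload so the sketch runs anywhere; with ONNX Runtime the
# callable would wrap one call to session.run(...) on a prepared input.
stats = measure_latency_ms(lambda: sum(range(10_000)), warmup=2, runs=10)
print(stats)
```
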
---

## Datasets

- **Photos:** [alfredplpl/Japanese-photos](https://huggingface.co/datasets/alfredplpl/Japanese-photos)
- **Documents:** [3sara/colpali_italian_documents](https://huggingface.co/datasets/3sara/colpali_italian_documents)

---

## 👤 Author

@vlad-m-dev

Built for offline, on-device (edge AI) image classification on phones and tablets: document vs. photo.

Telegram: https://t.me/dwight_schrute_engineer

---

## 🛠️ Usage Example

```python
import numpy as np
import onnxruntime as ort

# Path to one of the files listed under "Versions".
MODEL_PATH = "mobilenet_v3_small.onnx"

session = ort.InferenceSession(MODEL_PATH)
img = np.random.randn(1, 3, 224, 224).astype(np.float32)  # Replace with your image preprocessing!
output = session.run(None, {"input": img})
pred_class = int(np.argmax(output[0]))
print(pred_class)  # 0 = document, 1 = photo
```
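
The random tensor in the usage example stands in for real preprocessing. A minimal sketch of that step is shown here, with several assumptions the card does not confirm: RGB channel order, pixel values scaled to [0, 1], and standard ImageNet mean/std normalization. Only the `1x3x224x224` float32 shape comes from the example itself. `preprocess` is a hypothetical helper, and it uses a plain NumPy nearest-neighbour resize to stay dependency-free; a real app would more likely decode and resize with Pillow or OpenCV.

```python
import numpy as np

# Assumption: standard ImageNet statistics; the model card does not state them.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(rgb: np.ndarray, size: int = 224) -> np.ndarray:
    """Turn an HxWx3 uint8 RGB array into a 1x3xSxS float32 tensor."""
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an HxWx3 RGB array")
    h, w = rgb.shape[:2]
    # Nearest-neighbour resize: pick one source row/column per output pixel.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = rgb[rows][:, cols]
    # Scale to [0, 1], normalize per channel, then reorder HWC -> NCHW.
    x = resized.astype(np.float32) / 255.0
    x = (x - IMAGENET_MEAN) / IMAGENET_STD
    return np.ascontiguousarray(x.transpose(2, 0, 1)[np.newaxis])


# Synthetic 480x640 image standing in for a decoded photo.
fake_photo = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
tensor = preprocess(fake_photo)
print(tensor.shape, tensor.dtype)
```

The resulting array can be passed to `session.run` in place of the random input, provided the assumed normalization matches how the model was trained.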