---
license: mit
datasets:
- alfredplpl/Japanese-photos
- 3sara/colpali_italian_documents
pipeline_tag: image-classification
tags:
- image-classification
- mobile
- tablet
- quantization
- onnx
- mobilenetv3
- mobilenet_v3
- mobilenetv3_onnx
- document-classification
- photo-classification
- real-time
- lightweight
- efficient
- document
- photo
- images
- q8
- int8
- edge-ai
- ai-on-device
- offline
- privacy
- fast
- android
- ios
- gallery
---

# MobileNetV3: ONNX, Quantized

### 🔥 Lightweight mobile model for **image classification** into two categories:

- **`document`** (scans, receipts, papers, invoices)
- **`photo`** (regular phone photos: scenes, people, nature, etc.)

---

## 🟢 Overview

- **Designed for mobile devices** (phones and tablets, Android/iOS) and suited to real-time on-device inference.
- Architecture: **MobileNetV3 Small**
- Format: **ONNX** (both float32 and quantized int8 versions included)
- Trained on balanced, real-world open-source datasets for both documents and photos.
- Ideal for tasks like:
  - Document detection in gallery/camera rolls
  - Screenshot, receipt, photo, and PDF preview classification
  - Image sorting for privacy-first offline AI assistants

---

## 🏷️ Model Classes

- **0**: `document`
- **1**: `photo`

---

## ⚡️ Versions

- `mobilenet_v3_small.onnx`: standard float32 for maximum accuracy (best for ARM/CPU)
- `mobilenet_v3_small_quant.onnx`: quantized int8 for faster inference and a smaller file size (best for low-power or edge devices)

---

## 🚀 Why this model?

- **Ultra-small size** (~10-15MB), real-time inference (<100ms) on most phones
- **Runs 100% offline** (privacy, no cloud required)
- **Easy integration** wherever ONNX Runtime is available, including React Native (`onnxruntime-react-native`), Android (ONNX Runtime), and iOS.
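
To check the latency figure above on your own hardware, a small timing helper can be useful. The sketch below is only an illustration: `median_latency_ms` and `run_once` are hypothetical names, not part of this model or of ONNX Runtime, and `run_once` stands for any zero-argument callable that performs one inference.

```python
import statistics
import time


def median_latency_ms(run_once, warmup=3, repeats=20):
    """Return the median wall-clock time of run_once() in milliseconds."""
    # Warm-up calls are not timed, so one-off setup costs do not skew the result.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

With an ONNX Runtime session, `run_once` could be something like `lambda: session.run(None, {input_name: img})`. The median is used here because it is less sensitive to occasional slow runs than the mean.
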

---

## 🗃️ Datasets

- **Photos:** [alfredplpl/Japanese-photos](https://huggingface.co/datasets/alfredplpl/Japanese-photos)
- **Documents:** [3sara/colpali_italian_documents](https://huggingface.co/datasets/3sara/colpali_italian_documents)

---

## 🤖 Author

@vlad-m-dev

Built for edge-ai/phone/tablet offline image classification: document vs photo

Telegram: https://t.me/dwight_schrute_engineer

---

## 🛠️ Usage Example

```python
import numpy as np
import onnxruntime as ort

MODEL_PATH = "mobilenet_v3_small.onnx"  # path to the downloaded model file

session = ort.InferenceSession(MODEL_PATH)
input_name = session.get_inputs()[0].name  # read the input name from the model

# Placeholder input: replace with your own image preprocessing!
img = np.random.randn(1, 3, 224, 224).astype(np.float32)

output = session.run(None, {input_name: img})
pred_class = int(np.argmax(output[0]))
print(pred_class)  # 0 = document, 1 = photo
```
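
The usage example feeds random data and leaves preprocessing to the reader. The sketch below shows one way to prepare a real image and read the result, under assumptions this model card does not confirm: ImageNet mean/std normalization, RGB channel order, and a raw-logit output of shape `(1, 2)`. The helper names `preprocess` and `decode` are hypothetical, and the resize is a simple nearest-neighbour one in NumPy to keep the example dependency-free.

```python
import numpy as np

# Assumption: standard ImageNet statistics; the model card does not state
# which normalization was used during training.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)
LABELS = {0: "document", 1: "photo"}  # from the Model Classes section


def preprocess(rgb, size=224):
    """Turn an HxWx3 uint8 RGB array into a 1x3x(size)x(size) float32 tensor."""
    h, w, _ = rgb.shape
    # Nearest-neighbour resize using integer index arrays.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = rgb[rows][:, cols].astype(np.float32) / 255.0
    normalized = (resized - IMAGENET_MEAN) / IMAGENET_STD
    # HWC -> CHW, then add the batch dimension.
    chw = normalized.transpose(2, 0, 1)[np.newaxis, ...]
    return np.ascontiguousarray(chw, dtype=np.float32)


def decode(logits):
    """Map a (1, 2) output to (label, probability), assuming raw logits."""
    z = logits[0] - np.max(logits[0])  # subtract the max for numerical stability
    probs = np.exp(z) / np.sum(np.exp(z))
    idx = int(np.argmax(probs))
    return LABELS[idx], float(probs[idx])
```

In the usage example, `img` would then come from `preprocess(your_rgb_array)` and the prediction from `decode(output[0])`. If the exported model already applies softmax, the probability returned by `decode` will be off, although the predicted label stays the same.
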