mini-gte (ONNX Quantized)

A lightweight, optimized version of the gte-small model for client-side inference (browser/edge). Exported to ONNX for compatibility with ONNX Runtime Web, Transformers.js, and other edge-friendly runtimes.

πŸš€ Features

  • ONNX Format: Ready for browser/edge deployment.
  • Quantized: Smaller size (~45MB) with minimal accuracy loss.
  • Sentence Embeddings: Generate embeddings for semantic search, clustering, etc.

πŸ“¦ Files

```
model/
β”œβ”€β”€ config.json
β”œβ”€β”€ model.onnx
β”œβ”€β”€ tokenizer_config.json
β”œβ”€β”€ special_tokens_map.json
└── vocab.txt
```
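
If you run `model.onnx` directly (without a pipeline wrapper), the model returns one embedding per token, and you must pool them yourself. A minimal sketch of masked mean pooling, with toy arrays standing in for the model's outputs:

```javascript
// Reduce per-token embeddings to one sentence embedding by averaging
// only the real (non-padding) tokens, as indicated by the attention mask.
function meanPool(tokenEmbeddings, attentionMask) {
  const dim = tokenEmbeddings[0].length;
  const sum = new Array(dim).fill(0);
  let count = 0;
  tokenEmbeddings.forEach((tok, i) => {
    if (attentionMask[i] === 1) {
      count += 1;
      for (let d = 0; d < dim; d++) sum[d] += tok[d];
    }
  });
  return sum.map((v) => v / count);
}

// Toy input: 3 tokens (the last is padding), embedding dim 2.
const pooled = meanPool(
  [[1, 2], [3, 4], [100, 100]],
  [1, 1, 0]
);
console.log(pooled); // [2, 3] — the padding token is ignored
```

In a real model the embedding dimension is 384 (gte-small) and the mask comes from the tokenizer's `attention_mask` output.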