mini-gte (ONNX Quantized)
A lightweight, quantized version of the gte-small model for client-side inference (browser/edge). Exported to ONNX for compatibility with ONNX.js, Transformers.js, and other edge-friendly runtimes.
Features
- ONNX Format: Ready for browser/edge deployment.
- Quantized: Smaller file size (~45 MB) with minimal accuracy loss.
- Sentence Embeddings: Generate embeddings for semantic search, clustering, etc.
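As a sketch of the post-processing step: encoder models like gte-small emit one vector per input token, and a single sentence embedding is commonly obtained by mean-pooling those token vectors, then comparing sentences with cosine similarity. A minimal pure-Python illustration (the token vectors below are made-up stand-ins, not real model output):

```python
import math

def mean_pool(token_embeddings):
    """Average per-token vectors into one sentence embedding."""
    dim = len(token_embeddings[0])
    n = len(token_embeddings)
    return [sum(vec[i] for vec in token_embeddings) / n for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "last hidden state" for two sentences (2 tokens x 2 dims each).
sent_a = mean_pool([[1.0, 0.0], [0.0, 1.0]])
sent_b = mean_pool([[2.0, 0.0], [0.0, 2.0]])
print(cosine_similarity(sent_a, sent_b))  # close to 1.0: same direction
```

In a real pipeline the token vectors come from running `model.onnx` in an ONNX runtime, and pooling should skip padding tokens via the attention mask; this sketch omits that for brevity.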
Files

```
model/
├── config.json
├── model.onnx
├── tokenizer_config.json
├── special_tokens_map.json
└── vocab.txt
```