mini-gte (ONNX Quantized)

A lightweight, optimized version of the gte-small model for client-side inference (browser/edge). Exported to ONNX for compatibility with ONNX Runtime Web, Transformers.js, and other edge-friendly runtimes.

πŸš€ Features

  • ONNX Format: Ready for browser/edge deployment.
  • Quantized: Smaller size (~45MB) with minimal accuracy loss.
  • Sentence Embeddings: Generate embeddings for semantic search, clustering, etc.

πŸ“¦ Files

```
model/
β”œβ”€β”€ config.json
β”œβ”€β”€ model.onnx
β”œβ”€β”€ tokenizer_config.json
β”œβ”€β”€ special_tokens_map.json
└── vocab.txt
```
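
If you run `model.onnx` directly (without a pipeline wrapper), the model returns one embedding per token, and you must pool them yourself. A minimal sketch of masked mean pooling, with toy arrays standing in for the model's outputs:

```javascript
// Reduce per-token embeddings to one sentence embedding by averaging
// only the real (non-padding) tokens, as indicated by the attention mask.
function meanPool(tokenEmbeddings, attentionMask) {
  const dim = tokenEmbeddings[0].length;
  const sum = new Array(dim).fill(0);
  let count = 0;
  tokenEmbeddings.forEach((tok, i) => {
    if (attentionMask[i] === 1) {
      count += 1;
      for (let d = 0; d < dim; d++) sum[d] += tok[d];
    }
  });
  return sum.map((v) => v / count);
}

// Toy input: 3 tokens (the last is padding), embedding dim 2.
const pooled = meanPool(
  [[1, 2], [3, 4], [100, 100]],
  [1, 1, 0]
);
console.log(pooled); // [2, 3] — the padding token is ignored
```

In a real model the embedding dimension is 384 (gte-small) and the mask comes from the tokenizer's `attention_mask` output.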