Khasi-English Semantic Search Model

The first production-ready semantic search model for Khasi-English language pairs.

Overview

This model enables semantic search between English and Khasi, supporting Northeast India's linguistic diversity. It was trained on 66,794 English-Khasi translation pairs.

Use Cases

  • Cross-lingual semantic search (English ↔ Khasi)
  • Document similarity in bilingual contexts
  • Cultural content discovery for Northeast India
  • Educational language learning tools

Performance

  • English-Khasi similarity: 0.69-0.74
  • Model size: ~90MB (lightweight deployment)
  • 384-dimensional embeddings
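
The size figure can be sanity-checked with simple arithmetic. The sketch below assumes the weights are stored as 32-bit floats and that the model has roughly 22.7 million parameters, as listed for the model weights. The helper names `float32_size_mb` and `check_embedding_dim` are hypothetical and used only for illustration.

```python
def float32_size_mb(num_values):
    """Size in megabytes (10**6 bytes) of `num_values` float32 numbers."""
    bytes_per_float32 = 4
    return num_values * bytes_per_float32 / 1_000_000


# Assumption: ~22.7M parameters stored as float32 (F32).
weights_mb = float32_size_mb(22_700_000)

# One 384-dimensional float32 embedding.
vector_bytes = 384 * 4

print(f"weights: ~{weights_mb:.1f} MB, one embedding: {vector_bytes} bytes")


def check_embedding_dim(model, expected_dim=384):
    """Return True if a loaded SentenceTransformer reports the expected output size."""
    return model.get_sentence_embedding_dimension() == expected_dim
```

The estimate of about 90.8 MB is consistent with the ~90MB figure above, and each stored embedding takes 1,536 bytes before any compression or quantization.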

Quick Start

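# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub (downloaded and cached on first use)
model = SentenceTransformer('MWirelabs/khasi-english-semantic-search')
# Example inputs to embed
sentences = ['Hello', 'hangne', 'Good morning']
# Returns one 384-dimensional vector per input sentence
embeddings = model.encode(sentences)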
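
Embeddings become useful for search once they are compared. The following is a minimal sketch of one common approach, ranking candidate texts by cosine similarity to a query. It is not necessarily how the authors evaluate the model, and the helpers `cosine_similarity`, `rank_documents`, and `search` are hypothetical names introduced here. The only assumption about the model is that `model.encode` accepts a list of strings, as in the snippet above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return (index, score) pairs for the top_k documents, best first."""
    scored = [(i, cosine_similarity(query_vec, vec))
              for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def search(query, documents, model, top_k=3):
    """Embed the query and documents with `model`, then rank the documents."""
    query_vec = model.encode([query])[0]
    doc_vecs = model.encode(documents)
    ranked = rank_documents(query_vec, doc_vecs, top_k)
    return [(documents[i], score) for i, score in ranked]
```

With the `model` loaded above, `search('Good morning', ['Hello', 'hangne'], model)` returns the candidates ordered by similarity to the query. For larger collections, embedding the documents once and reusing the vectors avoids re-encoding them on every query.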

Developed by MWirelabs for Northeast India AI innovation.

Model weights: Safetensors format, 22.7M parameters, F32 tensors.