Update README.md
README.md
CHANGED
@@ -1,10 +1,59 @@
|
|
1 |
---
|
2 |
-
title:
|
3 |
-
emoji:
|
4 |
-
colorFrom:
|
5 |
-
colorTo:
|
6 |
-
sdk: static
|
7 |
pinned: false
|
8 |
---
|
9 |
|
10 |
-
|
1 |
---
|
2 |
+
title: "MWire Labs"
|
3 |
+
emoji: "π‘"
|
4 |
+
colorFrom: "yellow"
|
5 |
+
colorTo: "gray"
|
6 |
+
sdk: "static"
|
7 |
pinned: false
|
8 |
+
license: "apache-2.0"
|
9 |
+
tags:
|
10 |
+
- AI
|
11 |
+
- NLP
|
12 |
+
- Multilingual
|
13 |
+
- Low-resource
|
14 |
+
- Khasi
|
15 |
+
- Northeast-India
|
16 |
+
- Governance
|
17 |
+
- Civic-Tech
|
18 |
---
|
19 |
|
20 |
+
# MWire Labs
|
21 |
+
|
22 |
+
MWire Labs is the applied AI research wing of MWire (formerly known as Marketing Wire).
|
23 |
+
We build scalable, ethical AI systems for underrepresented languages and grassroots data, pioneering research from Northeast India with global relevance.
|
24 |
+
|
25 |
+
Our work bridges the gap between **local challenges and frontier AI**, creating models and datasets that empower governance, nonprofits, and rural innovation.
|
26 |
+
|
27 |
+
---
|
28 |
+
|
29 |
+
## Focus
|
30 |
+
|
31 |
+
- **Low-Resource NLP**: Building language models for Khasi and other underserved languages.
|
32 |
+
- **Grassroots Datasets**: Structuring data from villages, agriculture, and civic institutions.
|
33 |
+
- **Applied Civic AI**: Turning AI research into practical tools for governance and entrepreneurship.
|
34 |
+
|
35 |
+
---
|
36 |
+
|
37 |
+
## Flagship Work
|
38 |
+
|
39 |
+
- [KhasiBERT](https://huggingface.co/MWirelabs/khasibert): the first Khasi language transformer model.
|
40 |
+
- [Khasi-English Semantic Search](https://huggingface.co/MWirelabs/khasi-english-semantic-search): cross-lingual search for low-resource contexts.
|
41 |
+
- [Northeast India Districts & Villages Dataset](https://huggingface.co/datasets/MWirelabs/Northeast-India-Districts-and-Villages): structured registry of 15,000+ villages and 100+ districts.
|
42 |
+
|
43 |
+
---
|
44 |
+
|
45 |
+
## Roadmap
|
46 |
+
|
47 |
+
- **NeoDAC Models**: lightweight, domain-aligned LLMs for governance and civic tech.
|
48 |
+
- **Expanded Multilingual Embeddings**: covering more Northeast Indian languages.
|
49 |
+
- **Open Agriculture & Rural Datasets**: unlocking insights for farmers, self-help groups (SHGs), and entrepreneurs.
|
50 |
+
- **Applied AI Pilots**: in partnership with nonprofits and government bodies.
|
51 |
+
|
52 |
+
---
|
53 |
+
|
54 |
+
## Why It Matters
|
55 |
+
|
56 |
+
Most AI research overlooks low-resource languages and rural contexts.
|
57 |
+
MWire Labs is proving that **AI can be built for the margins, not just the mainstream**, starting from the Northeast and scaling to India and beyond.
|
58 |
+
|
59 |
+
---
|
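
The main metadata change in this diff is that the front-matter fields left blank in the old README (`title`, `emoji`, `colorFrom`, `colorTo`) are now filled in. A minimal sketch of how such a change could be sanity-checked is shown below. It assumes a deliberately simplified front-matter format (scalar values and one-level lists only) and is not a full YAML parser. The helper names `parse_front_matter` and `empty_fields` are hypothetical and not part of any Hugging Face tooling. The sample strings are abbreviated from the diff, and the `emoji` value is omitted from the new sample because it is garbled in this capture.

```python
def parse_front_matter(text):
    """Parse a simplified front-matter block delimited by '---' lines.

    Assumption: only 'key: value' scalars and one-level '- item' lists occur.
    A key with no value is stored as an empty list until items are appended.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    list_key = None  # key currently collecting '- item' entries
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter ends the front matter
        if not stripped:
            continue
        if stripped.startswith("- ") and list_key is not None:
            meta[list_key].append(stripped[2:].strip().strip('"'))
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            continue  # not a 'key: value' line; skipped in this sketch
        key, value = key.strip(), value.strip()
        if value == "":
            meta[key] = []
            list_key = key
        else:
            list_key = None
            if value in ("true", "false"):
                meta[key] = value == "true"
            else:
                meta[key] = value.strip('"')
    return meta


def empty_fields(meta):
    """Return the keys whose value is empty (blank scalar or empty list)."""
    return [key for key, value in meta.items() if value in ("", [])]


# Old front matter, as shown on the removed side of the diff.
OLD_README = """---
title:
emoji:
colorFrom:
colorTo:
sdk: static
pinned: false
---
"""

# New front matter, abbreviated; the emoji line is left out (garbled in source).
NEW_README = """---
title: "MWire Labs"
colorFrom: "yellow"
colorTo: "gray"
sdk: "static"
pinned: false
license: "apache-2.0"
tags:
  - AI
  - NLP
---

# MWire Labs
"""

if __name__ == "__main__":
    print("old, empty:", empty_fields(parse_front_matter(OLD_README)))
    print("new, empty:", empty_fields(parse_front_matter(NEW_README)))
```

For real use, a proper YAML library would be a safer choice than this hand-rolled parser, which ignores nesting, multi-line values, and comments.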