---
title: README
emoji: 🏢
colorFrom: blue
colorTo: blue
sdk: static
pinned: false
---

# ZA-African-Next-Voices (za-african-next-voices)

**Purpose:**

This organization was created to manage and share datasets and models for the South African component of the African Next Voices (ANV) project.

If you’re looking for all datasets, models, or work from our broader team, visit our main org: [DSFSI](https://huggingface.co/dsfsi).

## About the South African Next Voices Project

**ZA-ANV** is building a **3,000-hour** multilingual, multi-domain speech dataset for South Africa, spanning seven local languages.

- **Languages:** Setswana, isiZulu, isiXhosa, Sesotho, Sepedi, isiNdebele, Tshivenda
- **Coverage:** 500 hours for each of the five main languages; 250 hours each for isiNdebele and Tshivenda (a pilot, experimental scale to support future work)
- **Domains:** Broad, general domains that reflect real-world diversity
- **Goal:** Enable robust speech and language technology for local South African languages, break literacy barriers, and make digital content locally relevant.

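
As a quick sanity check on the coverage figures above, the per-language targets can be tallied against the 3,000-hour total. This is only a minimal sketch: the dictionary and function names are illustrative rather than part of any project tooling, and it assumes the 250-hour target applies to isiNdebele and Tshivenda individually, which is the reading under which the stated total is reached.

```python
# Planned recording hours per language, taken from the Coverage bullet.
# Assumption: the 250-hour figure applies to isiNdebele and Tshivenda
# individually, not combined.
PLANNED_HOURS = {
    "Setswana": 500,
    "isiZulu": 500,
    "isiXhosa": 500,
    "Sesotho": 500,
    "Sepedi": 500,
    "isiNdebele": 250,
    "Tshivenda": 250,
}


def total_planned_hours(hours_by_language: dict[str, int]) -> int:
    """Sum the planned recording hours across all languages."""
    return sum(hours_by_language.values())


total = total_planned_hours(PLANNED_HOURS)
print(f"{len(PLANNED_HOURS)} languages, {total:,} hours")  # 7 languages, 3,000 hours
```

Five languages at 500 hours contribute 2,500 hours, and the two pilot languages add 500 more, giving the 3,000-hour target.
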
## About DSFSI

**Data Science for Social Impact (DSFSI)** is a research group at the Computer Science Department, University of Pretoria.

We work at the intersection of **Data Science for Society** and **Local Language NLP**.

Our mission:

*Data-driven collaborative innovation to empower society to tackle challenges and preserve our languages.*

Find all our work and resources at: [huggingface.co/dsfsi](https://huggingface.co/dsfsi)

**Questions?**

Contact us via our [DSFSI website](https://www.dsfsi.co.za) or through the main [DSFSI Hugging Face org](https://huggingface.co/dsfsi).