metadata
title: README
emoji: 🏢
colorFrom: blue
colorTo: blue
sdk: static
pinned: false

ZA-African-Next-Voices (za-african-next-voices)

Purpose:
This organization was created to manage and share datasets and models for the South African component of the African Next Voices (ANV) project.
If you’re looking for all datasets, models, or work from our broader team, visit our main org: DSFSI.

About the South African Next Voices Project

ZA-ANV is building a 3,000-hour multilingual, multi-domain speech dataset for South Africa, spanning seven local languages.

  • Languages: Setswana, isiZulu, isiXhosa, Sesotho, Sepedi, isiNdebele, Tshivenda
  • Coverage: 500 hours for each of the five main languages; 250 hours each for isiNdebele and Tshivenda (a pilot/experimental scale for future work)
  • Domains: Broad/general domains to reflect real-world diversity
  • Goal: Enable robust speech and language technology for local South African languages, break literacy barriers, and make digital content locally relevant.
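
As a quick sanity check on the coverage figures above, the per-language targets can be summed to confirm they match the 3,000-hour total. This is a minimal sketch, not project code: the names `PLANNED_HOURS` and `total_planned_hours` are hypothetical, and it assumes the 250-hour figure applies to each of isiNdebele and Tshivenda, which is what the stated total implies.

```python
# Planned collection targets in hours, taken from the list above.
# Assumption: 250 hours applies to EACH of isiNdebele and Tshivenda.
# These are targets, not a statement of what has been released.
PLANNED_HOURS = {
    "Setswana": 500,
    "isiZulu": 500,
    "isiXhosa": 500,
    "Sesotho": 500,
    "Sepedi": 500,
    "isiNdebele": 250,
    "Tshivenda": 250,
}


def total_planned_hours(hours_by_language: dict[str, int]) -> int:
    """Return the sum of planned hours across all languages."""
    return sum(hours_by_language.values())


if __name__ == "__main__":
    for language, hours in PLANNED_HOURS.items():
        print(f"{language}: {hours} h")
    print(f"Total: {total_planned_hours(PLANNED_HOURS)} h")
```

Five languages at 500 hours plus two at 250 hours gives 2,500 + 500 = 3,000 hours, consistent with the overall target.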

About DSFSI

Data Science for Social Impact (DSFSI) is a research group at the Computer Science Department, University of Pretoria. We work at the intersection of Data Science for Society and Local Language NLP.

Our mission:

Data-driven collaborative innovation to empower society to tackle challenges and preserve our languages.

Find all our work and resources at: huggingface.co/dsfsi

Questions?
Contact us via our DSFSI website or through the main DSFSI Hugging Face org.