MALIBA-TTS: Revolutionizing Speech Synthesis for Malian Languages 🇲🇱
MALIBA-TTS represents a breakthrough in African language technology, offering high-quality text-to-speech synthesis for six Malian languages. These models bridge a critical gap in speech technology, bringing voice synthesis capabilities to languages spoken by millions yet historically underserved by technology.
Try It Out
Experience MALIBA-TTS directly in your browser: Live Demo on Hugging Face Spaces
Bridging the Digital Language Divide
Despite being spoken by over 20 million people combined, Malian languages have remained severely underrepresented in speech technology. MALIBA-TTS directly addresses this critical gap, making digital speech interfaces accessible to speakers of Bambara, Boomu, Dogon, Pular, Songhoy, and Tamasheq for the first time. This work represents a crucial step toward digital language equality.
Table of Contents
- Try It Out
- Technical Specifications
- Transforming Access to Technology in Mali
- Installation
- Usage
- Language Examples
- The MALIBA-AI Impact
- Limitations
- Future Development
- References
- License
- Contributing
Technical Specifications
Model Specifications
- Architecture: VITS (Variational Inference with adversarial learning for end-to-end TTS)
- Base Model: Meta's MMS (Massively Multilingual Speech), facebook/mms-tts
- Model Size: 145 MB per language
- Format: PyTorch
- Sampling Rate: 16kHz
- Audio Encoding: 16-bit PCM
- Languages: Bambara, Boomu, Dogon, Pular, Songhoy, and Tamasheq
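The 16 kHz sampling rate and 16-bit PCM encoding listed above determine how a synthesized waveform is stored on disk. As a minimal sketch, assuming the model output has already been converted to a plain sequence of floats in the range [-1.0, 1.0], the standard-library `wave` module can write it as a mono WAV file. The helper names `float_to_pcm16` and `write_wav` are hypothetical and are not part of MALIBA-TTS.

```python
import struct
import wave

SAMPLE_RATE = 16_000  # 16 kHz, per the specification list above


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        ints.append(int(round(s * 32767)))
    return struct.pack(f"<{len(ints)}h", *ints)


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write a mono 16-bit PCM WAV file using only the standard library."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(float_to_pcm16(samples))
```

With a NumPy waveform, a call such as `write_wav("output.wav", waveform.tolist())` would produce a file matching the format listed above.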
Performance
- Inference: Optimized to run on CPU
- Real-time Capability: Generates speech with minimal latency
- Memory Footprint: ~4GB RAM recommended for optimal performance
- Deployment Flexibility: Works on standard hardware without specialized accelerators
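One common way to quantify the real-time capability mentioned above is the real-time factor (RTF): synthesis time divided by the duration of the generated audio, where a value below 1.0 means speech is produced faster than it plays back. The sketch below is illustrative only. The function names are hypothetical, `synthesize` stands in for any callable that returns a sequence of audio samples, and the 16 kHz default is taken from the specification list. No measured values for MALIBA-TTS are implied.

```python
import time


def real_time_factor(synthesis_seconds, num_samples, sample_rate=16_000):
    """Return synthesis time divided by audio duration (below 1.0 is faster than real time)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


def timed_synthesis(synthesize, text, sample_rate=16_000):
    """Time one call to `synthesize(text)` and return (waveform, RTF)."""
    start = time.perf_counter()
    waveform = synthesize(text)
    elapsed = time.perf_counter() - start
    return waveform, real_time_factor(elapsed, len(waveform), sample_rate)
```

Averaging the RTF over several sentences on the target hardware gives a more reliable picture than a single call, since the first call often includes one-off warm-up costs.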
Transforming Access to Technology in Mali
MALIBA-TTS enables numerous applications previously unavailable to speakers of Malian languages:
- Education: Audio-based learning tools for literacy and education in mother tongues
- Accessibility: Making digital content accessible to visually impaired users
- Healthcare: Voice interfaces for health information in local languages
- Cultural Preservation: Digital narration of stories and cultural heritage
- Mobile Access: Voice responses for smartphone users with limited literacy
- Public Service: Automated voice announcements and information systems
Installation
pip install transformers torch soundfile
Usage
import torch
import soundfile as sf
from transformers import VitsModel, AutoTokenizer
# Available languages: bambara, boomu, dogon, pular, songhoy, tamasheq
language = "bambara"
model_id = "MALIBA-AI/malian-tts"
# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, subfolder=f"models/{language}")
model = VitsModel.from_pretrained(model_id, subfolder=f"models/{language}")
# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
# Synthesize speech
text = "An filɛ ni ye yɔrɔ minna ni an ye an sigi ka a layɛ yala an bɛ ka baara min kɛ ɛsike a kɛlen don ka Ɲɛ wa ?"
inputs = tokenizer(text, return_tensors="pt").to(device)
with torch.no_grad():
    output = model(**inputs).waveform
waveform = output.squeeze().cpu().numpy()
sample_rate = model.config.sampling_rate
# Save to file
sf.write("output.wav", waveform, sample_rate)
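For applications that switch between languages, the steps above can be wrapped in a small helper. This is a sketch rather than an official API: `model_subfolder` and `synthesize_to_file` are hypothetical names, and normalizing the language name (trimming whitespace, lowercasing) is an assumption. The repository ID and the `models/{language}` subfolder layout are taken from the example above.

```python
SUPPORTED_LANGUAGES = ("bambara", "boomu", "dogon", "pular", "songhoy", "tamasheq")
MODEL_ID = "MALIBA-AI/malian-tts"


def model_subfolder(language):
    """Return the repository subfolder for a language, rejecting unknown names."""
    key = language.strip().lower()  # assumption: accept case and whitespace variations
    if key not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"Unsupported language {language!r}; expected one of {SUPPORTED_LANGUAGES}"
        )
    return f"models/{key}"


def synthesize_to_file(text, language, path="output.wav"):
    """Load the model for `language`, synthesize `text`, and save a WAV file."""
    # Imported lazily so the validation helper above works without these packages.
    import soundfile as sf
    import torch
    from transformers import AutoTokenizer, VitsModel

    subfolder = model_subfolder(language)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, subfolder=subfolder)
    model = VitsModel.from_pretrained(MODEL_ID, subfolder=subfolder)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()

    inputs = tokenizer(text, return_tensors="pt").to(device)
    with torch.no_grad():
        waveform = model(**inputs).waveform.squeeze().cpu().numpy()
    sf.write(path, waveform, model.config.sampling_rate)
    return path
```

Validating the language name before loading means a typo fails immediately with a readable error instead of a failed download. A real application would likely cache the loaded tokenizer and model per language rather than reloading them on every call.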
Language Examples
# Bambara
text = "An filɛ ni ye yɔrɔ minna ni an ye an sigi ka a layɛ yala an bɛ ka baara min kɛ ɛsike a kɛlen don ka Ɲɛ wa ?"
# Boomu
text = "Vunurobe wozomɛ pɛɛ, Poli we zo woro han Deeɓenu wara li Deeɓenu faralo zuun. Lo we baba a lo wara yi see ɓa Zuwifera ma ɓa Gɛrɛkela wa."
# Dogon
text = "Pɔɔlɔ, kubɔ lugo joo le, bana dɛin dɛin le, inɛw Ama titiyaanw le digɛu, Ama, emɛ babe bɛrɛ sɔɔ sɔi."
# Pular
text = "Miɗo ndaarde saabe Laamɗo e saabe Iisaa Almasiihu caroyoowo wuurɓe e maayɓe oo, miɗo ndaardire saabe gartol makko ka num e Laamu makko"
# Songhoy
text = "Haya ka se beenediyo kokoyteraydi go hima nda huukoy foo ka fatta ja subaahi ka taasi goykoyyo ngu rezẽ faridi se"
# Tamasheq
text = "Toḍă tăfukt ɣas, issăɣră-dd măssi-s n-ašĕkrĕš ănaẓraf-net, inn'-as: 'Ǝɣĕr-dd inaxdimăn, tĕẓlĕd-asăn, sănt s-wi dd-ĕšrăynen har tĕkkĕd wi dd-ăzzarnen."
The MALIBA-AI Impact
MALIBA-TTS is part of MALIBA-AI's broader mission to ensure "No Malian Language Left Behind." This initiative is actively transforming Mali's digital landscape by:
- Breaking Language Barriers: Providing technology in languages that Malians actually speak
- Enabling Local Innovation: Allowing Malian developers to build voice-based applications
- Preserving Cultural Heritage: Digitizing and preserving Mali's rich oral traditions
- Democratizing AI: Making cutting-edge technology accessible to all Malians regardless of literacy level
- Building Local Expertise: Training Malian AI practitioners and researchers
Limitations
[coming soon]
Future Development
MALIBA-AI is committed to continuing this work with:
- Expansion to more Malian languages and dialects
References
@misc{malian-tts,
author = {MALIBA-AI},
title = {Text-to-Speech Models for Six Malian Languages},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/MALIBA-AI/malian-tts}}
}
@inproceedings{kim2021conditional,
title={Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech},
author={Kim, Jaehyeon and Kong, Jungil and Son, Juhee},
booktitle={International Conference on Machine Learning},
year={2021}
}
@article{meta2023mms,
title={Scaling Speech Technology to 1,000+ Languages},
author={A. Pratap and others},
journal={arXiv preprint arXiv:2305.13516},
year={2023}
}
License
This project is licensed under CC BY-NC 4.0 (Attribution-NonCommercial).
Terms of Use
- Users agree to use the model in a way that respects Malian languages and culture
- We encourage the use of these models to develop solutions that improve digital accessibility for speakers of Malian languages
- Any use of the models must acknowledge MALIBA-AI and Meta
- Commercial usage is not allowed
Contributing
MALIBA-TTS is part of the MALIBA-AI initiative, whose mission is "No Malian Language Left Behind." We welcome contributions from:
- Language Experts: To improve the quality and accuracy of the models
- Developers: To create applications using these models
- Researchers: To explore technical improvements and optimizations
- Data Contributors: To enrich TTS training data
- Community Members: To provide feedback and testing across dialects
To contribute, please visit MALIBA-AI or contact [coming soon] directly.
MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation
"No Malian Language Left Behind"