localenlp-en-wol
Fine-tuned MarianMT model for English-to-Wolof translation.
Model Card for LOCALENLP/english-wolof
This is a machine translation model for English → Wolof, developed by the LOCALENLP organization.
It is based on the pretrained Helsinki-NLP/opus-mt-en-mul MarianMT model and fine-tuned on a custom parallel corpus of ~84k sentence pairs.
Model Details
Model Description
- Developed by: LOCALENLP
- Funded by: N/A
- Shared by: LOCALENLP
- Model type: Seq2Seq Transformer (MarianMT)
- Languages: English → Wolof
- License: MIT
- Finetuned from model: Helsinki-NLP/opus-mt-en-mul
Model Sources
- Repository: https://huggingface.co/LOCALENLP/english-wolof
- Demo: Gradio / web app integration planned
Uses
Direct Use
- Translate English text into Wolof for research, education, and communication.
- Useful for low-resource NLP tasks, digital content creation, and cultural preservation.
Downstream Use
- Can be integrated into translation apps, chatbots, and education platforms.
- Serves as a base for further fine-tuning on domain-specific Wolof corpora (see the fine-tuning sketch below).
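The sketch below illustrates one way such domain-specific fine-tuning could look with the Hugging Face Seq2SeqTrainer. The CSV file name, the en/wol column names, and all hyperparameters are placeholder assumptions, not the settings used to train this model.

```python
from datasets import load_dataset
from transformers import (
    MarianTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "LOCALENLP/english-wolof"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Hypothetical domain corpus with "en" and "wol" text columns.
dataset = load_dataset("csv", data_files={"train": "my_domain_corpus.csv"})

def preprocess(batch):
    # Keep the >>wol<< target-language token, matching the base model's convention.
    sources = [">>wol<< " + s for s in batch["en"]]
    return tokenizer(sources, text_target=batch["wol"], max_length=128, truncation=True)

tokenized = dataset["train"].map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

# Illustrative hyperparameters; tune for your corpus size and hardware.
args = Seq2SeqTrainingArguments(
    output_dir="en-wol-domain-finetuned",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
    logging_steps=100,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```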
Out-of-Scope Use
- Not suitable for unreviewed legal or medical translations (e.g., contracts, prescriptions, medical records).
- Like any automated system, the model can mistranslate; human review is recommended for sensitive content.
Bias, Risks, and Limitations
- Training data is from a custom collection of parallel sentences (~84k pairs).
- Some informal or culturally nuanced expressions may not be accurately translated.
- Wolof spelling and grammar variation (Latin script) may lead to inconsistencies.
- Model may underperform on domain-specific or long, complex texts.
Recommendations
- Use human post-editing for high-stakes use cases.
- Evaluate performance on your target domain before deployment (see the evaluation sketch below).
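As a minimal sketch of that evaluation step, the snippet below scores the model on a handful of held-out sentence pairs with sacreBLEU. The example sentences are placeholders, and the references must be filled in with gold Wolof translations from your own domain data.

```python
import sacrebleu
from transformers import MarianTokenizer, AutoModelForSeq2SeqLM

model_name = "LOCALENLP/english-wolof"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Placeholder held-out pairs from your target domain.
sources = ["The clinic opens at nine in the morning.", "Please sign the attendance sheet."]
references = ["...", "..."]  # replace with gold Wolof reference translations

inputs = tokenizer(
    [">>wol<< " + s for s in sources],
    return_tensors="pt", padding=True, truncation=True,
)
outputs = model.generate(**inputs, max_length=512, num_beams=4)
hypotheses = tokenizer.batch_decode(outputs, skip_special_tokens=True)

# Corpus-level BLEU against a single reference set.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"Domain BLEU: {bleu.score:.2f}")
```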
How to Get Started with the Model
```python
from transformers import MarianTokenizer, AutoModelForSeq2SeqLM

model_name = "LOCALENLP/english-wolof"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Good evening, how was your day?"

# The >>wol<< prefix is the MarianMT target-language token inherited from opus-mt-en-mul.
inputs = tokenizer(">>wol<< " + text, return_tensors="pt", padding=True, truncation=True)
outputs = model.generate(**inputs, max_length=512, num_beams=4)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)

print("English:", text)
print("Wolof:", translation)
```
Evaluation results
- BLEU on English-Wolof custom dataset (self-reported): 76.12