T5-dhivehi-typo-corrector-asr
This model is a fine-tuned version of t5-small that corrects typographic and transcription errors in Dhivehi text, especially errors introduced by automatic speech recognition (ASR) systems. It is optimized for cleaning up ASR output and may not perform reliably on general-purpose text correction or on input from other sources. For best results, use it only to post-process Dhivehi ASR output.
Usage
You can use this model to correct ASR-generated text from Dhivehi audio. Here's an example using Hugging Face Transformers:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and the fine-tuned model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("alakxender/t5-dhivehi-typo-corrector-asr")
model = AutoModelForSeq2SeqLM.from_pretrained("alakxender/t5-dhivehi-typo-corrector-asr")

# Prepend the "fix: " task prefix to the ASR text to be corrected
input_text = "މަސްދޫކޮށް ފަހަރަކު"
input_ids = tokenizer("fix: " + input_text, return_tensors="pt").input_ids

# Generate the correction and decode it back to text
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
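To correct several ASR lines at once, the single-sentence example can be extended to a batch. The following is a minimal sketch, not part of the original model card: the helper names `build_prompts` and `correct_batch` are hypothetical, and the `max_new_tokens=128` limit is an assumed value rather than a documented recommendation. It reuses the `"fix: "` prefix from the example and relies on padding so that inputs of different lengths can share one tensor.

```python
from typing import List

PREFIX = "fix: "  # task prefix taken from the usage example
MODEL_NAME = "alakxender/t5-dhivehi-typo-corrector-asr"


def build_prompts(lines: List[str]) -> List[str]:
    """Strip whitespace, drop empty lines, and prepend the task prefix."""
    return [PREFIX + line.strip() for line in lines if line.strip()]


def correct_batch(lines: List[str], max_new_tokens: int = 128) -> List[str]:
    """Correct a list of ASR lines in one generate() call.

    max_new_tokens=128 is an assumption; tune it to your transcript length.
    """
    # Imported lazily so build_prompts can be used without these packages.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    prompts = build_prompts(lines)
    if not prompts:
        return []

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    # Pad to the longest prompt so the batch forms a rectangular tensor.
    batch = tokenizer(prompts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `correct_batch(["މަސްދޫކޮށް ފަހަރަކު"])` returns a list with one corrected string per non-empty input line. The sketch reloads the model on every call for brevity; in a real pipeline you would likely load it once and reuse it.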
Performance
- Final Validation Loss: 0.3487
Base model: google-t5/t5-small