Model Card for Eng-to-Telugu Colloquial Translation Model

Model Details

Model Description

This model is a fine-tuned version of unsloth/mistral-7b-instruct-v0.3-bnb-4bit, specifically optimized for translating English text into colloquial Telugu. It is designed to produce natural and fluent translations that reflect spoken language rather than formal or literary text.

  • Developed by: Dragonspirit21
  • Model type: Large Language Model (LLM) optimized for low-bit precision (bnb-4bit)
  • Language(s) (NLP): English, Telugu
  • License: Apache 2.0
  • Finetuned from model: unsloth/mistral-7b-instruct-v0.3-bnb-4bit
  • PEFT Version: 0.14.0

Model Sources

  • Repository: https://huggingface.co/Dragonspirit21/Engtotelugu_colloquial

Uses

Direct Use

This model can be used to translate English text into spoken-style Telugu, making it suitable for conversational AI, subtitles, and informal communication.
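A minimal usage sketch with `transformers` and `peft` is shown below. The repo ids come from this card; the instruction phrasing and generation settings are assumptions, not the card's documented prompt format.

```python
# Hedged sketch: loading this card's LoRA adapter on top of the base model.
# Repo ids are from the card; prompt wording and decoding settings are assumptions.
BASE_MODEL = "unsloth/mistral-7b-instruct-v0.3-bnb-4bit"
ADAPTER = "Dragonspirit21/Engtotelugu_colloquial"

def build_prompt(english_text: str) -> str:
    # Mistral-instruct chat format; the exact instruction text is an assumption.
    return (
        "[INST] Translate the following English text into colloquial Telugu:\n"
        f"{english_text} [/INST]"
    )

def translate(english_text: str, max_new_tokens: int = 128) -> str:
    # Heavy imports kept local so build_prompt stays importable without a GPU.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
    model = PeftModel.from_pretrained(model, ADAPTER)

    inputs = tokenizer(build_prompt(english_text), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Strip the prompt tokens and decode only the generated continuation.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Greedy decoding (`do_sample=False`) is used here for reproducibility; sampling may yield more varied colloquial phrasing.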

Downstream Use

It can be further fine-tuned for domain-specific applications such as customer service chatbots, dubbing scripts, and regional content localization.

Out-of-Scope Use

Not recommended for highly technical, legal, or academic translations where formal Telugu is required.

Bias, Risks, and Limitations

Like any model trained on open and user-generated data, this model may inherit biases from its training corpus. Its translations may not always be accurate or culturally appropriate, so users should verify critical translations before use.

Recommendations

  • Verify translations for sensitive or professional contexts.
  • Fine-tune on domain-specific data if high accuracy is required.

Training Details

Training Data

This model was fine-tuned using a dataset consisting of parallel English-Telugu colloquial text pairs, sourced from open datasets and user-generated content.

Training Procedure

Preprocessing

  • Data cleaning and normalization were performed.
  • Text was tokenized with a custom tokenizer for Telugu.
  • Low-bit (bnb-4bit) quantization was applied for memory-efficient training and inference.
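The cleaning and normalization step might look like the sketch below. The specific rules (Unicode NFC normalization, whitespace collapsing) are assumptions about what such a pipeline typically does, not the card's documented procedure.

```python
import re
import unicodedata

def normalize_pair(english: str, telugu: str) -> tuple[str, str]:
    # NFC normalization matters for Telugu, where combining vowel signs can be
    # encoded in more than one byte sequence for the same visible glyph.
    english = unicodedata.normalize("NFC", english)
    telugu = unicodedata.normalize("NFC", telugu)
    # Collapse runs of whitespace and trim leading/trailing spaces.
    english = re.sub(r"\s+", " ", english).strip()
    telugu = re.sub(r"\s+", " ", telugu).strip()
    return english, telugu
```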

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Batch size: 8
  • Epochs: 3
  • Optimizer: AdamW_8bit
  • PEFT Version: 0.14.0
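These hyperparameters map onto a `transformers.TrainingArguments` configuration roughly as follows. The batch size, epochs, precision, and optimizer come from this card; the output directory and learning rate are assumptions.

```python
from transformers import TrainingArguments

# Config fragment mirroring the hyperparameters listed in this card.
args = TrainingArguments(
    output_dir="outputs",            # assumption: not stated in the card
    per_device_train_batch_size=8,   # batch size: 8
    num_train_epochs=3,              # epochs: 3
    bf16=True,                       # bf16 mixed precision
    optim="adamw_bnb_8bit",          # the card's "AdamW_8bit" (bitsandbytes 8-bit AdamW)
    learning_rate=2e-4,              # assumption: a common LoRA rate, not stated in the card
)
```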

Speeds, Sizes, Times

  • Model size: 7B parameters (bnb-4bit quantized)
  • Training time: ~10 hours on a Tesla T4 GPU (Kaggle)
  • Checkpoints: Available in the Hugging Face model repo

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • A held-out set of English-Telugu colloquial text pairs.

Factors

  • Performance was measured on the fluency, accuracy, and naturalness of translations.

Results

The model achieves high fluency in colloquial translations but may struggle with rare idioms and niche technical terms.

Summary

The model is well-suited for conversational translation tasks but should be fine-tuned further for specific domains requiring higher accuracy.

Model Examination

  • The model was evaluated for hallucinations and bias using sample translations.
  • Some inconsistencies were found in idiomatic expressions, which may need additional fine-tuning.

Environmental Impact

Carbon emissions were estimated using the Machine Learning Impact calculator.

  • Hardware Type: Tesla T4 GPU
  • Hours used: ~10 hours
  • Cloud Provider: Kaggle

Technical Specifications

Model Architecture and Objective

  • Based on the Mistral 7B transformer architecture with LoRA adapters for efficient fine-tuning.
  • Optimized for low-bit precision (bnb-4bit) using the Unsloth framework.
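A typical LoRA adapter configuration for Mistral-7B's attention and MLP projections is sketched below. The rank, alpha, dropout, and target modules are assumptions; the card does not state them.

```python
from peft import LoraConfig

# Hedged sketch: the card confirms LoRA adapters on Mistral 7B, but not these values.
lora_config = LoraConfig(
    r=16,                    # assumption: adapter rank
    lora_alpha=32,           # assumption: scaling factor
    lora_dropout=0.05,       # assumption
    target_modules=[         # assumption: common targets for Mistral-style models
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```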

Compute Infrastructure

Hardware

  • GPU: Tesla T4 (Kaggle)
  • VRAM: 15GB per GPU

Software

  • PEFT Version: 0.14.0
  • Transformers Version: 4.37.0
  • Python Version: 3.10

Citation

BibTeX:

@misc{dragonspirit21_2025,
  author = {Dragonspirit21},
  title = {Eng-to-Telugu Colloquial Translation Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Dragonspirit21/Engtotelugu_colloquial}}
}

APA:

Dragonspirit21. (2025). Eng-to-Telugu Colloquial Translation Model. Hugging Face. Retrieved from https://huggingface.co/Dragonspirit21/Engtotelugu_colloquial

Glossary

  • bnb-4bit: 4-bit quantization using bitsandbytes for efficient inference.
  • PEFT: Parameter-Efficient Fine-Tuning framework.
  • LoRA: Low-Rank Adaptation, a technique for fine-tuning large language models efficiently.

More Information

For more details, visit the model repository on Hugging Face: https://huggingface.co/Dragonspirit21/Engtotelugu_colloquial

Model Card Authors

  • Dragonspirit21

Model Card Contact

For inquiries, please reach out via Hugging Face or email at [More Information Needed].

Framework versions

  • PEFT 0.14.0