IndicTrans3

IndicTrans3 is a multilingual translation model for 15 Indic languages. This repository provides an inference script that leverages vLLM for efficient and scalable translation.

The model is built on top of Gemma-3 and fine-tuned for document-level translation tasks. It supports both sentence-level and document-level translation in both directions:

  • English → Indic languages
  • Indic languages → English

🌐 Supported Languages

IndicTrans3 supports translation across a wide range of Indic languages. The primary set consists of these 15 languages:

Assamese, Bengali, Gujarati, Hindi, Kannada, Maithili, Malayalam, Marathi, Nepali, Odia, Punjabi, Sanskrit, Tamil, Telugu, Urdu

In addition to these, IndicTrans3 extends preliminary support to the following 7 low-resource languages:

Bodo, Dogri, Kashmiri, Konkani, Manipuri, Santali, Sindhi

⚠️ Note: While these low-resource languages are supported, their translation quality may vary due to limited training data. We are actively working on improving support for these languages, and enhancements will be included in future releases.
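The two tiers above can be captured in a small lookup, for example to warn a user before translating into a language with only preliminary support. This is a minimal sketch: the function name `support_tier` and the tier labels are hypothetical, and the exact identifiers that the inference script expects for `--src_lang` and `--tgt_lang` are not documented here, so plain English language names are assumed.

```python
# Language names copied from the lists above. Whether the inference script
# expects these names or some other identifier is NOT documented here.
PRIMARY_LANGUAGES = frozenset({
    "Assamese", "Bengali", "Gujarati", "Hindi", "Kannada", "Maithili",
    "Malayalam", "Marathi", "Nepali", "Odia", "Punjabi", "Sanskrit",
    "Tamil", "Telugu", "Urdu",
})

LOW_RESOURCE_LANGUAGES = frozenset({
    "Bodo", "Dogri", "Kashmiri", "Konkani", "Manipuri", "Santali", "Sindhi",
})


def support_tier(language: str) -> str:
    """Classify a language name (hypothetical helper, not part of the repo).

    Returns "english", "primary", "preliminary", or "unsupported".
    """
    name = language.strip().title()
    if name == "English":
        return "english"
    if name in PRIMARY_LANGUAGES:
        return "primary"
    if name in LOW_RESOURCE_LANGUAGES:
        return "preliminary"
    return "unsupported"


if __name__ == "__main__":
    for lang in ("Hindi", "Santali", "French"):
        print(lang, "->", support_tier(lang))
```

A caller could use the "preliminary" result to attach the quality caveat from the note above to any output in those languages.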

🛠️ Installation

  1. Install PyTorch
    Follow the instructions based on your system and CUDA version from the official PyTorch website.

  2. Install required dependencies

    pip install vllm transformers
    
  3. Run inference with vLLM

    python vllm-inference.py \
        --model <model_path> \
        --input_file <input_file> \
        --output_path <output_file> \
        --src_lang <source_language> \
        --tgt_lang <target_language> \
        --input_column <input_column> \
        --input_type <input_type>
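When the same input has to be translated into several target languages, the command above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in this README; the helper name `build_inference_command` is hypothetical, and the placeholder values are left exactly as they appear above because the accepted values for each flag are not documented here.

```python
import shlex


def build_inference_command(model, input_file, output_path,
                            src_lang, tgt_lang, input_column, input_type):
    """Return the argv list for vllm-inference.py (hypothetical helper).

    Only the flags shown in this README are used; no defaults are assumed.
    """
    return [
        "python", "vllm-inference.py",
        "--model", model,
        "--input_file", input_file,
        "--output_path", output_path,
        "--src_lang", src_lang,
        "--tgt_lang", tgt_lang,
        "--input_column", input_column,
        "--input_type", input_type,
    ]


if __name__ == "__main__":
    # Placeholders are kept as in the README; substitute real values.
    cmd = build_inference_command(
        model="<model_path>",
        input_file="<input_file>",
        output_path="<output_file>",
        src_lang="<source_language>",
        tgt_lang="<target_language>",
        input_column="<input_column>",
        input_type="<input_type>",
    )
    # shlex.join quotes each argument so the line is safe to paste in a shell.
    print(shlex.join(cmd))
    # To execute instead of print: subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than a single formatted string, avoids quoting problems when paths contain spaces.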

License

This model is licensed under the CC BY 4.0 license. You are free to share and adapt the material for any purpose, even commercially, as long as you provide appropriate credit, provide a link to the license, and indicate if changes were made.
