
LaMaTE

Model Description

LaMaTE is a high-performance and efficient translation model built on Llama-3-8B. It uses a large language model (LLM) as the machine translation (MT) encoder, paired with a lightweight decoder, and integrates an adapter that bridges the LLM representations to the decoder. The model is trained with a two-stage strategy to improve both performance and efficiency.
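
Conceptually, the design can be sketched as follows. This is a minimal, illustrative sketch only, not the actual implementation (that is the LlamaCrossAttentionEncDec class used in the Quick Start below); all module names and sizes here are placeholders.

import torch.nn as nn

class LLMEncDecSketch(nn.Module):
    """Illustrative sketch: LLM encoder -> adapter -> lightweight decoder with cross-attention."""
    def __init__(self, llm_encoder, llm_dim, dec_dim, dec_layers, vocab_size):
        super().__init__()
        self.encoder = llm_encoder                    # pretrained LLM; assumed to return .last_hidden_state
        self.adapter = nn.Linear(llm_dim, dec_dim)    # bridges LLM representations to the decoder
        dec_layer = nn.TransformerDecoderLayer(d_model=dec_dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=dec_layers)  # lightweight decoder
        self.tgt_embed = nn.Embedding(vocab_size, dec_dim)
        self.lm_head = nn.Linear(dec_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        memory = self.adapter(self.encoder(src_ids).last_hidden_state)  # adapted source states
        dec_out = self.decoder(self.tgt_embed(tgt_ids), memory)         # cross-attend to the source
        return self.lm_head(dec_out)                                    # next-token logits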

Key Features of LaMaTE

  • Enhanced Efficiency: Delivers 2.4× to 6.5× faster decoding.
  • Reduced Memory Usage: Cuts KV cache memory consumption by 75% (see the back-of-envelope sketch after this list).
  • Competitive Performance: Performs robustly across diverse translation tasks.
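
As a back-of-envelope illustration of where the memory saving comes from: the KV cache grows with the number of decoding layers, so replacing the 32 decoding layers of Llama-3-8B with a much smaller decoder shrinks it roughly proportionally. The layer counts below are hypothetical placeholders; see the paper for the actual decoder configuration.

# Illustrative arithmetic only; the layer counts are hypothetical placeholders.
llm_decoding_layers = 32      # Llama-3-8B used as a decoder-only MT model
lite_decoder_layers = 8       # a hypothetical lightweight decoder
reduction = 1 - lite_decoder_layers / llm_decoding_layers
print(f"KV cache reduction: {reduction:.0%}")  # -> 75%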

Quick Start

For more detailed usage, please refer to the GitHub repository.

Note: Our implementation is developed with transformers v4.39.2. We recommend installing this version for best compatibility.
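
For example, with pip:

pip install transformers==4.39.2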

To use LaMaTE, load the model with from_pretrained() and translate with generate():

from modeling_llama_seq2seq import LlamaCrossAttentionEncDec  # provided in the LaMaTE repository
from transformers import AutoTokenizer, AutoConfig

model_name_or_path = "NiuTrans/LaMaTE"  # local checkpoint directory or the Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
config = AutoConfig.from_pretrained(model_name_or_path, trust_remote_code=True)
model = LlamaCrossAttentionEncDec.from_pretrained(model_name_or_path, config=config)

# Build the translation prompt and tokenize it
prompt = "Translate the following text from English into Chinese.\nEnglish: The harder you work at it, the more progress you will make.\nChinese: "
input_ids = tokenizer(prompt, return_tensors="pt")

# Deterministic beam-search decoding
outputs_tokenized = model.generate(
    **input_ids,
    num_beams=5,
    do_sample=False
)
outputs = tokenizer.batch_decode(outputs_tokenized, skip_special_tokens=True)
print(outputs)
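
Since LaMaTE is an encoder-decoder model, generate() returns only the decoder output, so outputs should be a list containing the Chinese translation of the prompt above; num_beams=5 with do_sample=False gives deterministic beam-search decoding.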

Citation

@misc{luoyf2025lamate,
      title={Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation}, 
      author={Yingfeng Luo and Tong Zheng and Yongyu Mu and Bei Li and Qinghong Zhang and Yongqi Gao and Ziqiang Xu and Peinan Feng and Xiaoqian Liu and Tong Xiao and Jingbo Zhu},
      year={2025},
      eprint={2503.06594},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}