---
license: apache-2.0
language:
- pt
- vmw
datasets:
- LIACC/Emakhuwa-Portuguese-News-MT
base_model:
- facebook/nllb-200-distilled-600M
pipeline_tag: translation
---

# NLLB-200 Translation Example

This guide demonstrates how to use an NLLB fine-tuned model for bilingual translation between Portuguese (`por_Latn`) and Emakhuwa (`vmw_Latn`).

## Prerequisites

Install the required packages:

```bash
pip install transformers torch sentencepiece
```

## Inference

```python
from transformers import AutoModelForSeq2SeqLM, NllbTokenizer
import torch

src_lang = "por_Latn"
tgt_lang = "vmw_Latn"
text = "Olá, mundo das línguas!"

device = "cuda:0" if torch.cuda.is_available() else "cpu"
model_name = "felerminoali/nllb200_pt_vmw_bilingual_ver1"

model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)
tokenizer = NllbTokenizer.from_pretrained(model_name)
tokenizer.src_lang = src_lang
tokenizer.tgt_lang = tgt_lang

inputs = tokenizer(
    text,
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=1024,
)

model.eval()  # turn off training mode (disables dropout)
with torch.no_grad():
    result = model.generate(
        **inputs.to(model.device),
        # Force the first generated token to be the target-language code,
        # which tells the NLLB decoder which language to produce.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
    )

print(tokenizer.batch_decode(result, skip_special_tokens=True)[0])
```
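Because the tokenizer call above truncates anything beyond `max_length=1024` tokens, long documents should be split into smaller pieces and translated chunk by chunk. Below is a minimal, model-independent sketch of such a splitter; it uses a crude whitespace word count as a stand-in for the real subword-token count, so the `max_words` budget and the `chunk_text` helper name are illustrative assumptions, not part of the model's API.

```python
import re

def chunk_text(text: str, max_words: int = 200) -> list[str]:
    """Split text on sentence boundaries into chunks of at most max_words words.

    Uses whitespace word counts as a rough proxy for subword tokens; pick
    max_words conservatively below the tokenizer's max_length.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each returned chunk can then be passed through the tokenizer/`generate` pipeline shown above, and the translated chunks joined back together.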