Model Card for NLLB Summarization Model (TradeNewsSum)

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on the TradeNewsSum dataset for multilingual abstractive summarization of foreign trade news.

Model Details

Model Description

This model supports summarization in Russian and English, focusing on short, informative summaries of foreign trade news. It is based on the multilingual NLLB-200 architecture (distilled version) and was trained using the Hugging Face transformers library.

Uses

How to Use

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("lyutovad/nllb-tradenewssum")
tokenizer = AutoTokenizer.from_pretrained("lyutovad/nllb-tradenewssum")

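text = "Введите здесь ваш новостной текст / Input your news article here."
# NLLB uses FLORES-200 language codes: "rus_Cyrl" for Russian, "eng_Latn" for English.
lang_code = "rus_Cyrl"  # or "eng_Latn"
tokenizer.src_lang = lang_code

# Truncate long articles to 1024 tokens, then generate with 4-beam search.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
generated_ids = model.generate(**inputs, max_length=256, num_beams=4)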
summary = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(summary)
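
The snippet above can be wrapped in a small helper so that the language is chosen per call. This is a minimal sketch, not the author's reference code: the function name `summarize` and the `NLLB_CODES` mapping are hypothetical, and passing `forced_bos_token_id` assumes the fine-tuned model is expected to begin its output with the same language token as the input. If that assumption does not hold for this checkpoint, drop the argument.

```python
# Hypothetical mapping from short language names to the FLORES-200 codes
# that NLLB tokenizers use for the two supported languages.
NLLB_CODES = {"ru": "rus_Cyrl", "en": "eng_Latn"}


def summarize(text, lang, model, tokenizer,
              max_input_tokens=1024, max_summary_tokens=256):
    """Summarize `text` in its own language ("ru" or "en")."""
    if lang not in NLLB_CODES:
        raise ValueError(f"Unsupported language {lang!r}; expected one of {sorted(NLLB_CODES)}")
    code = NLLB_CODES[lang]

    # The tokenizer prepends the source-language token based on src_lang.
    tokenizer.src_lang = code
    inputs = tokenizer(text, return_tensors="pt",
                       truncation=True, max_length=max_input_tokens)

    generated_ids = model.generate(
        **inputs,
        max_length=max_summary_tokens,
        num_beams=4,
        # Assumption: the summary should start with the source-language token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(code),
    )
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

With the model and tokenizer loaded as shown earlier, a call would look like `summarize(text, "ru", model, tokenizer)`.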

Direct Use

  • Generate abstractive summaries of trade-related news
  • Automate summarization workflows for multilingual datasets
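
For automated workflows over mixed Russian/English data, it may help to keep each batch monolingual, because the source language is set on the tokenizer before each call. The sketch below only illustrates that grouping step; the helper name `batches_by_language` and the `(lang_code, text)` record layout are assumptions, not part of the model's API.

```python
from collections import defaultdict


def batches_by_language(records, batch_size=8):
    """Yield (lang_code, texts) batches in which every text shares one language.

    `records` is an iterable of (lang_code, text) pairs, e.g.
    ("rus_Cyrl", "...") or ("eng_Latn", "...").
    """
    grouped = defaultdict(list)
    for lang_code, text in records:
        grouped[lang_code].append(text)

    # Languages come out in first-seen order; texts keep their input order.
    for lang_code, texts in grouped.items():
        for start in range(0, len(texts), batch_size):
            yield lang_code, texts[start:start + batch_size]
```

Each yielded batch can then be tokenized with `tokenizer.src_lang` set to its `lang_code` and passed to `model.generate` in one call.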

Out-of-Scope Use

  • General-purpose summarization outside the trade domain
  • Languages other than Russian and English, which were not included in training
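
A caller may want to reject obviously unsupported input before it reaches the model. The following is a rough heuristic sketch, not a language detector: it only inspects the Unicode script of the letters, so any Cyrillic text is treated as Russian and any Latin text as English. The function name `guess_nllb_code` and the 0.8 threshold are illustrative assumptions.

```python
import unicodedata


def guess_nllb_code(text, min_share=0.8):
    """Return "rus_Cyrl", "eng_Latn", or None based on the dominant script."""
    counts = {"CYRILLIC": 0, "LATIN": 0}
    letters = 0
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, and whitespace
        letters += 1
        name = unicodedata.name(ch, "")
        for script in counts:
            if name.startswith(script):
                counts[script] += 1

    if letters == 0:
        return None
    if counts["CYRILLIC"] / letters >= min_share:
        return "rus_Cyrl"
    if counts["LATIN"] / letters >= min_share:
        return "eng_Latn"
    return None  # mixed or unsupported script
```

A `None` result can be used to skip the text or route it to manual review. Latin-script languages other than English still pass this check, so it narrows the out-of-scope cases rather than eliminating them.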

Evaluation

Testing Data

Test split of the TradeNewsSum dataset.

Factors

Evaluated separately for Russian and English subsets.

Metrics

Language  ROUGE-1  ROUGE-2  ROUGE-L  ROUGE-Lsum  METEOR  BERTScore-F1  NER-F1
ru        0.5948   0.4954   0.3446   0.4898      0.4900  0.9528        0.704
en        0.5225   0.5807   0.4400   0.5178      0.5178  0.9300        0.618

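ROUGE: Overlap with the reference summary. ROUGE-1 and ROUGE-2 count shared unigrams and bigrams; ROUGE-L and ROUGE-Lsum are based on the longest common subsequence.
METEOR: Word-level matching that also credits stems and synonyms.
BERTScore: Semantic similarity computed from contextual embeddings.
NER-F1: How well named entities are preserved in the summary.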
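
To make two of these definitions concrete, here is a simplified sketch of an n-gram overlap F1 and an entity-set F1. It is not the evaluation code behind the table: tokenization is plain lowercase whitespace splitting, the entity sets are assumed to come from some external NER step, and the function names are hypothetical. Reported scores should come from an established metrics package.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, n_candidate, n_reference):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / n_candidate
    recall = overlap / n_reference
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N style F1 with clipped n-gram counts."""
    cand = _ngrams(candidate.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter "&" keeps the minimum count
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def entity_f1(summary_entities, reference_entities):
    """F1 over two sets of entity strings (entity extraction is out of scope here)."""
    summary_entities = set(summary_entities)
    reference_entities = set(reference_entities)
    overlap = len(summary_entities & reference_entities)
    return _f1(overlap, len(summary_entities), len(reference_entities))
```

For example, `rouge_n_f1("the cat sat on the mat", "the cat is on the mat", n=2)` shares three of five bigrams on each side and so gives 0.6.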

Model size: 615M params (Safetensors, tensor type F32)