RuBERT base fine-tuned on ruDEFT and WCL Wiki Ru datasets.

The model detects whether a text contains a definition. It is a binary classifier that predicts the definition_label column of the dataset.

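import torch
from transformers import AutoTokenizer, BertForSequenceClassification

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("psytechlab/wcl-wiki_rudeft__rubert-model")
model = BertForSequenceClassification.from_pretrained("psytechlab/wcl-wiki_rudeft__rubert-model")
model.eval()  # inference mode: disables dropout

# "Moscow is a city in the Russian Federation", "I want to study languages"
text = ["москва - это город в РФ", "хочу изучать языки"]

# Pad or truncate every input to 512 tokens and return PyTorch tensors.
tokenized_text = tokenizer(text, padding="max_length", truncation=True, max_length=512, return_tensors="pt")

# Run the forward pass without gradients and take the highest-scoring class per text.
with torch.no_grad():
    prediction = model(**tokenized_text).logits
    print(prediction.argmax(dim=1).numpy())
# [1 0]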

Preprocessing

  • lower_string
  • remove_punct
  • remove_latin
  • swap_enter_to_space
  • collapse_spaces
  • strip_string
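The step names above can be sketched as a small pipeline. This is a minimal illustration, not the authors' code: only the function names come from this card, while their bodies, the order of application (taken from the list order), and the exact punctuation set are assumptions.

```python
import re
import string

# Assumption: ASCII punctuation plus guillemets, dashes, and the ellipsis character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation + "«»\u2014\u2013\u2026")


def lower_string(text: str) -> str:
    return text.lower()


def remove_punct(text: str) -> str:
    # Assumption: punctuation is deleted rather than replaced with a space.
    return text.translate(_PUNCT_TABLE)


def remove_latin(text: str) -> str:
    # Assumption: only the basic Latin letters A-Z and a-z are removed.
    return re.sub(r"[A-Za-z]", "", text)


def swap_enter_to_space(text: str) -> str:
    return re.sub(r"[\r\n]+", " ", text)


def collapse_spaces(text: str) -> str:
    return re.sub(r" {2,}", " ", text)


def strip_string(text: str) -> str:
    return text.strip()


# Assumption: steps run in the order in which the card lists them.
PIPELINE = (
    lower_string,
    remove_punct,
    remove_latin,
    swap_enter_to_space,
    collapse_spaces,
    strip_string,
)


def preprocess(text: str) -> str:
    for step in PIPELINE:
        text = step(text)
    return text
```

Under these assumptions, `preprocess("Москва - это город в РФ.\nMoscow is a city.")` returns `"москва это город в рф"`. Applying the same normalization to inputs at inference time is likely to keep them closer to what the model saw during training.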

Training procedure

Training

Training was done with the Trainer class using the following parameters:

from transformers import TrainingArguments

training_args = TrainingArguments(
    num_train_epochs=7,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    weight_decay=0.01,
    learning_rate=3e-5,
    logging_strategy="steps",
    logging_steps=50,
    save_strategy="epoch",
    save_total_limit=5,
    seed=21,
    metric_for_best_model="eval_f1_macro",
)
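The `metric_for_best_model="eval_f1_macro"` setting suggests that a metric function passed to the Trainer returned a key named `f1_macro`, since the Trainer prefixes evaluation metrics with `eval_`. The card does not show that function, so the following is only a dependency-free sketch of what it might look like; the names `f1_macro` and `compute_metrics` and their bodies are assumptions.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def compute_metrics(eval_pred):
    # eval_pred unpacks into (logits, labels); each logits row has one score per class.
    logits, labels = eval_pred
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    return {"f1_macro": f1_macro(list(labels), preds)}
```

A function of this shape would be passed as `compute_metrics=compute_metrics` when constructing the Trainer.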

Metrics

Metrics on the combined set (ruDEFT + WCL Wiki Ru), psytechlab/rus_rudeft_wcl-wiki:


              precision    recall  f1-score   support

           0       0.90      0.93      0.92      1421
           1       0.87      0.81      0.84       753

    accuracy                           0.89      2174
   macro avg       0.88      0.87      0.88      2174
weighted avg       0.89      0.89      0.89      2174
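As a quick sanity check on how the two average rows relate to the per-class rows, the F1 averages can be recomputed from the table above. This sketch uses the rounded values as printed, so for other columns or tables the recomputed figure may differ from the reported one in the last digit.

```python
# Per-class F1 and support copied from the combined-set table above.
rows = {
    0: {"f1": 0.92, "support": 1421},
    1: {"f1": 0.84, "support": 753},
}

total = sum(r["support"] for r in rows.values())

# Macro average: plain mean over classes, ignoring class sizes.
macro_f1 = sum(r["f1"] for r in rows.values()) / len(rows)

# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(r["f1"] * r["support"] for r in rows.values()) / total

print(total, round(macro_f1, 2), round(weighted_f1, 2))
# 2174 0.88 0.89
```

The macro average treats both classes equally, which makes it the stricter summary here because class 1 (definitions) is the smaller and harder class.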

Metrics only on astromis/ruDEFT:


              precision    recall  f1-score   support

           0       0.87      0.95      0.91       836
           1       0.84      0.67      0.74       353

    accuracy                           0.86      1189
   macro avg       0.85      0.81      0.82      1189
weighted avg       0.86      0.86      0.86      1189

Metrics only on astromis/WCL_Wiki_Ru:


              precision    recall  f1-score   support

           0       0.95      0.92      0.93       585
           1       0.89      0.93      0.91       400

    accuracy                           0.92       985
   macro avg       0.92      0.92      0.92       985
weighted avg       0.92      0.92      0.92       985

Citation

@article{Popov2025TransferringNL,
  title={Transferring Natural Language Datasets Between Languages Using Large Language Models for Modern Decision Support and Sci-Tech Analytical Systems},
  author={Dmitrii Popov and Egor Terentev and Danil Serenko and Ilya Sochenkov and Igor Buyanov},
  journal={Big Data and Cognitive Computing},
  year={2025},
  url={https://api.semanticscholar.org/CorpusID:278179500}
}

Model size: 0.2B params (Safetensors), tensor type F32.