Description

polarity-3c is a classification model specialized in determining the polarity of texts from news portals. It was trained primarily on Polish texts.

Annotations from plWordNet served as the basis for the data. A model pretrained on these annotations was then used in a human-in-the-loop setup to support the annotation of training data for the final model. The final model was trained on web content that was collected and annotated manually.

As the base model, sdadas/polish-roberta-large-v2 was used with a classification head. More about the model's construction can be found on our blog.

Architecture

RobertaForSequenceClassification(
  (roberta): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(128001, 1024, padding_idx=1)
      (position_embeddings): Embedding(514, 1024, padding_idx=1)
      (token_type_embeddings): Embedding(1, 1024)
      (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): RobertaEncoder(
      (layer): ModuleList(
        (0-23): 24 x RobertaLayer(
          (attention): RobertaAttention(
            (self): RobertaSdpaSelfAttention(
              (query): Linear(in_features=1024, out_features=1024, bias=True)
              (key): Linear(in_features=1024, out_features=1024, bias=True)
              (value): Linear(in_features=1024, out_features=1024, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): RobertaSelfOutput(
              (dense): Linear(in_features=1024, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): RobertaIntermediate(
            (dense): Linear(in_features=1024, out_features=4096, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): RobertaOutput(
            (dense): Linear(in_features=4096, out_features=1024, bias=True)
            (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
  )
  (classifier): RobertaClassificationHead(
    (dense): Linear(in_features=1024, out_features=1024, bias=True)
    (dropout): Dropout(p=0.1, inplace=False)
    (out_proj): Linear(in_features=1024, out_features=3, bias=True)
  )
)
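The classifier head above maps the 1024-dimensional pooled representation to three polarity logits. As a minimal sketch (not the trained model: random weights stand in for the learned parameters, and the tanh activation between `dense` and `out_proj` follows the Hugging Face `RobertaClassificationHead` implementation, which does not show up in the printout above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for the trained head weights (hidden size 1024, 3 classes).
W_dense = rng.normal(scale=0.02, size=(1024, 1024))
b_dense = np.zeros(1024)
W_out = rng.normal(scale=0.02, size=(1024, 3))
b_out = np.zeros(3)

def classification_head(pooled):
    """Forward pass of the head: dense -> tanh -> out_proj -> softmax.

    `pooled` is the 1024-dim hidden state of the first token;
    dropout is omitted (inference mode).
    """
    h = np.tanh(pooled @ W_dense + b_dense)
    logits = h @ W_out + b_out
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

probs = classification_head(rng.normal(size=1024))
print(probs)  # three probabilities, one per polarity class
```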

Usage

Example of use with transformers pipeline:

from transformers import pipeline

classifier = pipeline(model="radlab/polarity-3c", task="text-classification")

classifier("Text to classify")

with sample data and top_k=3:

classifier("""
  Po upadku reżimu Asada w Syrii, mieszkańcy, borykający się z ubóstwem,
  zaczęli tłumnie poszukiwać skarbów, zachęceni legendami o zakopanych
  bogactwach i dostępnością wykrywaczy metali, które stały się popularnym
  towarem. Mimo że działalność ta jest nielegalna, rząd przymyka oko,
  a sprzedawcy oferują urządzenia nawet dla dzieci. Poszukiwacze skupiają
  się na obszarach historycznych, wierząc w legendy o skarbach ukrytych
  przez starożytne cywilizacje i wojska osmańskie, choć eksperci ostrzegają
  przed fałszywymi monetami i kradzieżą artefaktów z muzeów.""",
  top_k=3
)

the output is:

[{'label': 'ambivalent', 'score': 0.9995126724243164},
 {'label': 'negative', 'score': 0.00024663121439516544},
 {'label': 'positive', 'score': 0.00024063512682914734}]
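With top_k=3 the pipeline returns all three classes sorted by descending score, so the predicted polarity is the first entry. A small helper (hypothetical, not part of the model's API) that reduces such an output to a single label, using the output shown above:

```python
def predict_label(results, threshold=0.5):
    """Pick the top label from a top_k pipeline output.

    `results` is a list of {'label', 'score'} dicts sorted by
    descending score, as returned by the pipeline. Returns None
    when the best score falls below `threshold`.
    """
    best = results[0]
    return best["label"] if best["score"] >= threshold else None

# The pipeline output shown above:
output = [
    {"label": "ambivalent", "score": 0.9995126724243164},
    {"label": "negative", "score": 0.00024663121439516544},
    {"label": "positive", "score": 0.00024063512682914734},
]
print(predict_label(output))  # ambivalent
```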
Model size: 435M parameters (F32, safetensors)
