A zero-shot classifier based on bertin-roberta-base-spanish
This model was fine-tuned from bertin-roberta-base-spanish as a cross-encoder on a Natural Language Inference (NLI) task. A CrossEncoder takes a sentence pair as input and outputs a label; here it learns to predict one of three labels: "contradiction": 0, "entailment": 1, "neutral": 2.
You can use it with Hugging Face's zero-shot classification pipeline. Given a sentence and an arbitrary set of labels or topics, it outputs the likelihood of the sentence belonging to each topic.
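To make the link between NLI and zero-shot classification concrete, here is a minimal sketch of how per-label scores can be derived from NLI logits in the single-label case: each candidate label becomes a hypothesis, the entailment logit for each (sentence, hypothesis) pair is kept, and a softmax across labels gives the scores. The function names and the logits below are hypothetical and made up for illustration; they are not outputs of this model, and the real pipeline may differ in detail.

```python
import math

# Label mapping stated in the model description.
LABEL2ID = {"contradiction": 0, "entailment": 1, "neutral": 2}


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(nli_logits_per_label):
    """Turn per-label NLI logits into single-label zero-shot scores.

    nli_logits_per_label maps each candidate label to the three NLI logits
    obtained for the pair (sentence, hypothesis built from that label).
    """
    labels = list(nli_logits_per_label)
    # Keep only the entailment logit for each candidate label.
    entailment = [nli_logits_per_label[name][LABEL2ID["entailment"]] for name in labels]
    scores = softmax(entailment)
    # Highest-scoring label first.
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)


# Made-up logits for illustration only; they are NOT outputs of the model.
example_logits = {
    "cultura": [-1.2, 3.1, 0.4],
    "economia": [2.0, -1.5, 0.3],
    "deportes": [1.7, -2.0, 0.9],
}
ranking = rank_labels(example_logits)
print(ranking)
```

Because the softmax runs across labels, the scores sum to 1 and are best read as relative preferences among the candidates you supplied, not as absolute probabilities.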
Usage (HuggingFace Transformers)
The simplest way to use the model is the Hugging Face Transformers pipeline. Initialize the pipeline with the task "zero-shot-classification" and select "hackathon-pln-es/bertin-roberta-base-zeroshot-esnli" as the model.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="hackathon-pln-es/bertin-roberta-base-zeroshot-esnli",
)

classifier(
    "El autor se perfila, a los 50 años de su muerte, como uno de los grandes de su siglo",
    candidate_labels=["cultura", "sociedad", "economia", "salud", "deportes"],
    hypothesis_template="Esta oración es sobre {}.",
)
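The hypothesis_template parameter is important and should be in Spanish, because each label is inserted into it to form the hypothesis that the model compares against the input sentence. In the hosted inference widget, this parameter is left at its default value, "This example is {}.", so the widget's results can differ from those of the code above.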
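As a small illustration of why the template language matters, the sketch below expands a template once per candidate label using plain string formatting, which is the same kind of substitution the `{}` placeholder implies. The helper name `build_hypotheses` is hypothetical and is not part of the Transformers API.

```python
def build_hypotheses(candidate_labels, hypothesis_template):
    """Expand the template once per candidate label (illustrative helper)."""
    # A template without a {} placeholder would produce the same hypothesis
    # for every label, so reject it early.
    if hypothesis_template.format(candidate_labels[0]) == hypothesis_template:
        raise ValueError("hypothesis_template must contain a {} placeholder")
    return [hypothesis_template.format(label) for label in candidate_labels]


labels = ["cultura", "sociedad", "economia"]

# Spanish template: every hypothesis is a well-formed Spanish sentence.
spanish = build_hypotheses(labels, "Esta oración es sobre {}.")

# Default English template: hypotheses mix English and Spanish.
english = build_hypotheses(labels, "This example is {}.")

for es, en in zip(spanish, english):
    print(es, "|", en)
```

With the English default, the model is asked to judge mixed-language hypotheses such as "This example is cultura.", which is unlike the Spanish sentence pairs it was trained on.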
Training
We used sentence-transformers to train the model.
Dataset
We used a collection of Natural Language Inference datasets as training data:
The complete dataset is available here.
Authors