license: apache-2.0

Cross-Encoder for Word Sense Relationships Classification

This model was trained on word sense relationships extracted from WordNet for semantic change type classification.

The model can be used to detect which kind of relationship (homonymy, antonymy, hypernymy, hyponymy, or co-hyponymy) holds between word senses. Given a pair of word sense definitions, encode the query with all possible passages (e.g. retrieved with ElasticSearch), then sort the passages in decreasing order of score.

The training code is available here: SBERT.net Training MS Marco

Usage with Transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Replace 'model_name' with the identifier of this model.
model = AutoModelForSequenceClassification.from_pretrained('model_name')
tokenizer = AutoTokenizer.from_pretrained('model_name')

# Each text in the first list is paired with the text at the same index in the second list.
features = tokenizer(
    ['How many people live in Berlin?', 'How many people live in Berlin?'],
    ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.',
     'New York City is famous for the Metropolitan Museum of Art.'],
    padding=True, truncation=True, return_tensors="pt",
)

# Switch to inference mode and skip gradient tracking.
model.eval()
with torch.no_grad():
    scores = model(**features).logits
    print(scores)
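
The printed logits can be turned into relationship names once the class order is known. The following is a minimal sketch in plain Python, assuming the model emits one logit per relationship type listed above. The label order in `ASSUMED_ID2LABEL` and the helper `logits_to_labels` are hypothetical and only illustrate the idea; the actual mapping should be read from `model.config.id2label`.

```python
import math

# Hypothetical label order: this model card does not state the mapping.
# With a loaded model, prefer model.config.id2label over this dictionary.
ASSUMED_ID2LABEL = {
    0: "homonymy",
    1: "antonymy",
    2: "hypernymy",
    3: "hyponymy",
    4: "co-hyponymy",
}


def softmax(row):
    """Convert one row of logits into probabilities (numerically stable)."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_labels(logits, id2label=ASSUMED_ID2LABEL):
    """Return a (label, probability) tuple for each row of logits."""
    results = []
    for row in logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((id2label[best], probs[best]))
    return results


# Made-up logits for two input pairs, used only to exercise the helper.
example_logits = [
    [0.1, 2.3, -1.0, 0.4, 0.0],
    [1.5, -0.2, 0.3, 3.1, 0.9],
]
for label, prob in logits_to_labels(example_logits):
    print(f"{label}: {prob:.3f}")
```

With the Transformers snippet above, the call would look like `logits_to_labels(scores.tolist(), model.config.id2label)`, provided the checkpoint's configuration defines meaningful label names.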

Usage with SentenceTransformers

The usage becomes easier when you have SentenceTransformers installed. Then, you can use the pre-trained models like this:

from sentence_transformers import CrossEncoder
model = CrossEncoder('model_name', max_length=512)
labels = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2'), ('Query', 'Paragraph3')])
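
Since `predict` takes a list of text pairs, comparing several senses of one word means building every pair of definitions first. Here is a small sketch of that step; the helper `sense_pairs` is hypothetical, and the example definitions are illustrative paraphrases, not verbatim WordNet glosses.

```python
from itertools import combinations


def sense_pairs(definitions):
    """Return every unordered pair of distinct sense definitions as tuples."""
    return list(combinations(definitions, 2))


# Illustrative definitions for the word "bank" (not taken verbatim from WordNet).
definitions = [
    "a financial institution that accepts deposits",
    "sloping land beside a body of water",
    "a building in which financial business is transacted",
]

pairs = sense_pairs(definitions)
for first, second in pairs:
    print(f"{first!r} <-> {second!r}")
```

The resulting `pairs` list has the same shape as the argument passed to `model.predict` above, so it could be handed to the cross-encoder directly. If the model treats the two positions asymmetrically (for example hypernymy versus hyponymy), `itertools.permutations(definitions, 2)` would cover both orders.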

Performance

In the following table, we provide various pre-trained Cross-Encoders together with their performance on the
