all-mpnet-base-v2-negation

This is a fine-tuned sentence-transformers model to perform better on negated pairs of sentences.

It maps sentences and paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.

Usage (Sentence-Transformers)

Using this model becomes easy when you have sentence-transformers installed:

pip install -U sentence-transformers

Then you can use the model like this:

from sentence_transformers import SentenceTransformer

sentences = [
    "I like rainy days because they make me feel relaxed.",
    "I don't like rainy days because they don't make me feel relaxed."
]

model = SentenceTransformer('dmlls/all-mpnet-base-v2-negation')
embeddings = model.encode(sentences)
print(embeddings)

Usage (HuggingFace Transformers)

Without sentence-transformers, you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings.

from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn.functional as F

# Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0] #First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


# Sentences we want sentence embeddings for
sentences = [
    "I like rainy days because they make me feel relaxed.",
    "I don't like rainy days because they don't make me feel relaxed."
]

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('dmlls/all-mpnet-base-v2-negation')
model = AutoModel.from_pretrained('dmlls/all-mpnet-base-v2-negation')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

# Normalize embeddings
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)

print(sentence_embeddings)

Background

This model was finetuned within the context of the This is not correct! Negation-aware Evaluation of Language Generation Systems paper.

Intended uses

Our model is intended to be used as a sentence and short paragraph encoder, performing well (i.e., reporting lower similarity scores) on negated pairs of sentences when compared to its base model.

Given an input text, it outputs a vector which captures the semantic information. The sentence vector may be used for information retrieval, clustering or sentence similarity tasks.

By default, input text longer than 384 word pieces is truncated.

Training procedure

Pre-training

We used sentence-transformers/all-mpnet-base-v2 as base model.

Fine-tuning

We fine-tuned the model on the CANNOT dataset using a contrastive objective. Formally, we compute the cosine similarity from each possible sentence pairs from the batch. We then apply the cross entropy loss by comparing with true pairs.

Hyper parameters

We followed an analogous approach to how other Sentence Transformers were trained. We took the first 90% of samples from the CANNOT dataset as the training split. We used a batch size of 64 and trained for 1 epoch.

dmlls
/

all-mpnet-base-v2-negation