---
license: mit
datasets:
- dleemiller/wiki-sim
- sentence-transformers/stsb
language:
- en
metrics:
- spearmanr
- pearsonr
base_model:
- jhu-clsp/ettin-encoder-17m
pipeline_tag: text-classification
library_name: sentence-transformers
tags:
- cross-encoder
- modernbert
- sts
- stsb
- stsbenchmark-sts
model-index:
- name: CrossEncoder based on jhu-clsp/ettin-encoder-17m
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test
      type: sts-test
    metrics:
    - type: pearson_cosine
      value: 0.8413715686076841
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.8310895302151975
      name: Spearman Cosine
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts dev
      type: sts-dev
    metrics:
    - type: pearson_cosine
      value: 0.8815197312565873
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.8786002071426082
      name: Spearman Cosine
---

# EttinX Cross-Encoder: Semantic Similarity (STS)

Cross-encoders are high-performing encoder models that compare two texts and output a similarity score between 0 and 1.
I've found the `cross-encoder/stsb-roberta-large` model very useful for building evaluators for LLM outputs.
Models like it are simple to use, fast, and very accurate.

The Ettin series followed up with new encoders built on the ModernBERT architecture, in a range of sizes starting at 17M parameters.
The reduced parameter count and computationally efficient interleaved local/global attention layers make this a very fast model:
it can easily process a few hundred sentence pairs per second on CPU, and a few thousand per second on my A6000.

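
Throughput depends heavily on hardware, batch size, and text length, so it is worth measuring on your own machine. The following is a minimal timing sketch, not a rigorous benchmark. The helper names `pairs_per_second` and `benchmark_ettinx`, the repeat count, and the sample pairs are illustrative assumptions; `predict` can be any callable that accepts a list of text pairs, such as `CrossEncoder.predict`.

```python
import time
from typing import Callable, Sequence, Tuple

Pair = Tuple[str, str]


def pairs_per_second(
    predict: Callable[[Sequence[Pair]], object],
    pairs: Sequence[Pair],
    repeats: int = 3,
) -> float:
    """Return the best observed throughput (pairs/second) over `repeats` runs."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict(pairs)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a very fast callable or a coarse clock.
    return len(pairs) / max(best, 1e-9)


def benchmark_ettinx(device: str = "cpu", n_pairs: int = 256) -> float:
    """Hypothetical helper: time this model on repeated copies of one short pair."""
    # Imported lazily so pairs_per_second works without the library installed.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder("dleemiller/EttinX-sts-xxs", device=device)
    pairs = [("It's a wonderful day outside.", "It's so sunny today!")] * n_pairs
    model.predict(pairs[:8])  # warm-up run, excluded from the timing
    return pairs_per_second(model.predict, pairs)
```

Calling `benchmark_ettinx()` downloads the weights on first use and returns pairs per second for short inputs; longer texts will be slower.
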

---

## Features

- **High performing:** Achieves **Pearson: 0.8414** and **Spearman: 0.8311** on the STS-Benchmark test set.
- **Efficient architecture:** Based on the Ettin-encoder design (17M parameters), offering very fast inference speeds.
- **Extended context length:** Processes sequences up to 8192 tokens, which suits evaluating long LLM outputs.
- **Diversified training:** Pretrained on `dleemiller/wiki-sim` and fine-tuned on `sentence-transformers/stsb`.


---

## Performance

| Model                     | STS-B Test Pearson | STS-B Test Spearman | Context Length | Parameters | Speed         |
|---------------------------|--------------------|---------------------|----------------|------------|---------------|
| `ModernCE-large-sts`      | **0.9256**         | **0.9215**          | **8192**       | 395M       | **Medium**    |
| `ModernCE-base-sts`       | **0.9162**         | **0.9122**          | **8192**       | 149M       | **Fast**      |
| `stsb-roberta-large`      | 0.9147             | -                   | 512            | 355M       | Slow          |
| `EttinX-sts-m`            | 0.9143             | 0.9102              | **8192**       | 149M       | **Fast**      |
| `EttinX-sts-s`            | 0.9004             | 0.8926              | **8192**       | 68M        | **Very Fast** |
| `stsb-distilroberta-base` | 0.8792             | -                   | 512            | 82M        | Fast          |
| `EttinX-sts-xs`           | 0.8763             | 0.8689              | **8192**       | 32M        | **Very Fast** |
| `EttinX-sts-xxs`          | 0.8414             | 0.8311              | **8192**       | 17M        | **Very Fast** |

---

## Usage

To use EttinX for semantic similarity tasks, you can load the model with the Hugging Face `sentence-transformers` library:

```python
from sentence_transformers import CrossEncoder

# Load EttinX model
model = CrossEncoder("dleemiller/EttinX-sts-xxs")

# Predict similarity scores for sentence pairs
sentence_pairs = [
    ("It's a wonderful day outside.", "It's so sunny today!"),
    ("It's a wonderful day outside.", "He drove to work earlier."),
]
scores = model.predict(sentence_pairs)

print(scores)  # Outputs: array([0.9184, 0.0123], dtype=float32)
```


### Output

The model returns similarity scores in the range `[0, 1]`, where higher scores indicate stronger semantic similarity.

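
Since these scores are intended for building LLM-output evaluators, here is a minimal sketch of how a score can become a pass/fail check against a reference answer. The names `EvalResult`, `evaluate_outputs`, and `load_ettinx_predict` are hypothetical, and the `0.7` threshold is an arbitrary placeholder that should be tuned on your own labeled data.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


@dataclass
class EvalResult:
    reference: str
    candidate: str
    score: float
    passed: bool


def evaluate_outputs(
    predict: Callable[[List[Pair]], Sequence[float]],
    references: Sequence[str],
    candidates: Sequence[str],
    threshold: float = 0.7,  # hypothetical cut-off; tune on your own data
) -> List[EvalResult]:
    """Score each candidate against its reference and apply a pass threshold."""
    if len(references) != len(candidates):
        raise ValueError("references and candidates must have the same length")
    pairs = list(zip(references, candidates))
    scores = predict(pairs)
    return [
        EvalResult(ref, cand, float(score), float(score) >= threshold)
        for (ref, cand), score in zip(pairs, scores)
    ]


def load_ettinx_predict() -> Callable[[List[Pair]], Sequence[float]]:
    """Return a scoring callable backed by the model (downloads weights on first use)."""
    # Imported lazily so evaluate_outputs can be tested with a stub scorer.
    from sentence_transformers import CrossEncoder

    return CrossEncoder("dleemiller/EttinX-sts-xxs").predict
```

A typical call would be `evaluate_outputs(load_ettinx_predict(), reference_answers, llm_answers)`. Passing the scorer in as a callable keeps the thresholding logic easy to unit-test without loading the model.
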

---

## Training Details

### Pretraining

The model was pretrained on the `pair-score-sampled` subset of the [`dleemiller/wiki-sim`](https://huggingface.co/datasets/dleemiller/wiki-sim) dataset. This dataset provides diverse sentence pairs with semantic similarity scores, helping the model build a robust understanding of relationships between sentences.

- **Classifier dropout:** 0.3, a relatively high value chosen to reduce over-reliance on the teacher scores.
- **Objective:** match the STS-B similarity scores produced by the teacher model, `cross-encoder/stsb-roberta-large`.

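
This card does not state which loss was used to fit the teacher scores. As one illustration only, a common choice for soft targets in `[0, 1]` is binary cross-entropy on the student's logit. The sketch below is a plain-Python version under that assumption; `soft_bce` is a hypothetical name, not a function from the training code.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_bce(logits: Sequence[float], teacher_scores: Sequence[float]) -> float:
    """Mean binary cross-entropy between student logits and teacher scores in [0, 1].

    Assumption: this is one plausible distillation loss, not necessarily the
    one used to train this model.
    """
    if len(logits) != len(teacher_scores) or not logits:
        raise ValueError("need two non-empty sequences of equal length")
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for logit, target in zip(logits, teacher_scores):
        p = min(max(sigmoid(logit), eps), 1.0 - eps)
        total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return total / len(logits)
```

The loss is smallest when the student's sigmoid output equals the teacher score, so a student trained this way is pulled toward reproducing the teacher's graded similarity rather than a hard 0/1 label.
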

### Fine-Tuning

Fine-tuning was performed on the [`sentence-transformers/stsb`](https://huggingface.co/datasets/sentence-transformers/stsb) dataset.

### Validation Results

After fine-tuning, the model achieved the following on the STS-Benchmark test set:

- **Pearson Correlation:** 0.8414
- **Spearman Correlation:** 0.8311

On the STS-Benchmark dev set, it reached:

- **Pearson Correlation:** 0.8815
- **Spearman Correlation:** 0.8786

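
For readers unfamiliar with the two metrics: Pearson measures linear agreement between predicted and gold scores, while Spearman is the Pearson correlation of their ranks, so it only rewards getting the ordering right. The following dependency-free sketch shows both calculations; the function names are illustrative, and in practice `scipy.stats.pearsonr` and `scipy.stats.spearmanr` compute the same quantities.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance normalized by both standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

To evaluate a model, pass the gold STS-B scores as `x` and the model's predictions as `y`.
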

---

## Model Card

- **Architecture:** Ettin-encoder-17m
- **Tokenizer:** Custom tokenizer trained with modern techniques for long-context handling.
- **Pretraining Data:** `dleemiller/wiki-sim (pair-score-sampled)`
- **Fine-Tuning Data:** `sentence-transformers/stsb`


---

## Thank You

Thanks to the Johns Hopkins team for providing the Ettin encoder models, built on the ModernBERT architecture, and to the Sentence Transformers team for their leadership in transformer encoder models.


---

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{ettinxstsb2025,
  author = {Miller, D. Lee},
  title = {EttinX STS: An STS cross encoder model},
  year = {2025},
  publisher = {Hugging Face Hub},
  url = {https://huggingface.co/dleemiller/EttinX-sts-xxs},
}
```


---

## License

This model is licensed under the [MIT License](LICENSE).