alexandrainst/scandi-nli-large-v2


saattrupdan committed on Apr 26

Commit 1c199b9 (verified) · 1 parent: 5b9b8f9

Update README.md

README.md +6 -2

@@ -84,7 +84,8 @@ The Scandinavian scores are the average of the Danish, Swedish and Norwegian sco
 | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
 | :-------- | :------------ | :--------- | :----------- | :----------- |
-| `alexandrainst/scandi-nli-large-v2` (this) | **75.42%** | **75.41%** | **84.95%** | 354M |
+| GPT-4o-mini | 68.88% | 68.57% | 76.89% | ? |
+| `alexandrainst/scandi-nli-large-v2` (this) | 75.42% | 75.41% | 84.95% | 354M |
 | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 73.70% | 74.44% | 83.91% | 354M |
 | [`MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 69.01% | 71.99% | 80.66% | 279M |
 | [`alexandrainst/scandi-nli-base`](https://huggingface.co/alexandrainst/scandi-nli-base) | 67.42% | 71.54% | 80.09% | 178M |
@@ -102,7 +103,8 @@ The test split is generated using [this gist](https://gist.github.com/saattrupda
 | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
 | :-------- | :------------ | :--------- | :----------- | :----------- |
-| `alexandrainst/scandi-nli-large-v2` (this) | **75.65%** | **59.23%** | **87.89%** | 354M |
+| GPT-4o-mini | **90.14%** | **63.70%** | **95.16%** | ? |
+| `alexandrainst/scandi-nli-large-v2` (this) | 75.65% | 59.23% | 87.89% | 354M |
 | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 73.80% | 58.41% | 86.98% | 354M |
 | [`MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 68.37% | 57.10% | 83.25% | 279M |
 | [`alexandrainst/scandi-nli-base`](https://huggingface.co/alexandrainst/scandi-nli-base) | 62.44% | 55.00% | 80.42% | 178M |
@@ -120,6 +122,7 @@ We acknowledge that not evaluating on a gold standard dataset is not ideal, but
 | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
 | :-------- | :------------ | :--------- | :----------- | :----------- |
+| GPT-4o-mini | 66.31% | 77.00% | 77.31% | ? |
 | `alexandrainst/scandi-nli-large-v2` (this) | **79.02%** | **85.99%** | **85.99%** | 354M |
 | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 76.69% | 84.47% | 84.38% | 354M |
 | [`joeddav/xlm-roberta-large-xnli`](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 75.35% | 83.42% | 83.55% | 560M |
@@ -138,6 +141,7 @@ We acknowledge that not evaluating on a gold standard dataset is not ideal, but
 | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
 | :-------- | :------------ | :--------- | :----------- | :----------- |
+| GPT-4o-mini | 50.19% | 65.00% | 65.46% | ? |
 | `alexandrainst/scandi-nli-large-v2` (this) | **71.59%** | **81.00%** | **80.96%** | 354M |
 | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 70.61% | 80.43% | 80.36% | 354M |
 | [`joeddav/xlm-roberta-large-xnli`](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 67.99% | 78.68% | 78.60% | 560M |