Update README.md
README.md
CHANGED
@@ -84,7 +84,8 @@ The Scandinavian scores are the average of the Danish, Swedish and Norwegian sco
 
 | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
 | :-------- | :------------ | :--------- | :----------- | :----------- |
-
+| GPT-4o-mini | 68.88% | 68.57% | 76.89% | ? |
+| `alexandrainst/scandi-nli-large-v2` (this) | 75.42% | 75.41% | 84.95% | 354M |
 | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 73.70% | 74.44% | 83.91% | 354M |
 | [`MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 69.01% | 71.99% | 80.66% | 279M |
 | [`alexandrainst/scandi-nli-base`](https://huggingface.co/alexandrainst/scandi-nli-base) | 67.42% | 71.54% | 80.09% | 178M |
@@ -102,7 +103,8 @@ The test split is generated using [this gist](https://gist.github.com/saattrupda
 
 | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
 | :-------- | :------------ | :--------- | :----------- | :----------- |
-
+| GPT-4o-mini | **90.14%** | **63.70%** | **95.16%** | ? |
+| `alexandrainst/scandi-nli-large-v2` (this) | 75.65% | 59.23% | 87.89% | 354M |
 | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 73.80% | 58.41% | 86.98% | 354M |
 | [`MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 68.37% | 57.10% | 83.25% | 279M |
 | [`alexandrainst/scandi-nli-base`](https://huggingface.co/alexandrainst/scandi-nli-base) | 62.44% | 55.00% | 80.42% | 178M |
@@ -120,6 +122,7 @@ We acknowledge that not evaluating on a gold standard dataset is not ideal, but
 
 | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
 | :-------- | :------------ | :--------- | :----------- | :----------- |
+| GPT-4o-mini | 66.31% | 77.00% | 77.31% | ? |
 | `alexandrainst/scandi-nli-large-v2` (this) | **79.02%** | **85.99%** | **85.99%** | 354M |
 | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 76.69% | 84.47% | 84.38% | 354M |
 | [`joeddav/xlm-roberta-large-xnli`](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 75.35% | 83.42% | 83.55% | 560M |
@@ -138,6 +141,7 @@ We acknowledge that not evaluating on a gold standard dataset is not ideal, but
 
 | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
 | :-------- | :------------ | :--------- | :----------- | :----------- |
+| GPT-4o-mini | 50.19% | 65.00% | 65.46% | ? |
 | `alexandrainst/scandi-nli-large-v2` (this) | **71.59%** | **81.00%** | **80.96%** | 354M |
 | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 70.61% | 80.43% | 80.36% | 354M |
 | [`joeddav/xlm-roberta-large-xnli`](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 67.99% | 78.68% | 78.60% | 560M |
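
For readers unfamiliar with the three metric columns in the tables above (MCC, Macro-F1, Accuracy), here is a minimal pure-Python sketch of how such scores could be computed from gold labels and predictions. It is not the evaluation script used for the reported numbers; the function names and the toy labels are hypothetical, and the multiclass MCC follows the standard confusion-matrix formulation.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never seen or predicted contributes an F1 of 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in labels)
    cov_tt = n * n - sum(v * v for v in true_counts.values())
    cov_pp = n * n - sum(v * v for v in pred_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    # Degenerate case (a single class on either side): report 0.
    return cov_tp / denom if denom else 0.0


# Toy three-class NLI example (hypothetical labels, for illustration only).
y_true = ["entailment", "neutral", "contradiction",
          "entailment", "neutral", "contradiction"]
y_pred = ["entailment", "neutral", "contradiction",
          "neutral", "neutral", "entailment"]

print(f"MCC:      {mcc(y_true, y_pred):.2%}")
print(f"Macro-F1: {macro_f1(y_true, y_pred):.2%}")
print(f"Accuracy: {accuracy(y_true, y_pred):.2%}")
```

Because Macro-F1 weights every class equally while Accuracy weights every example equally, the two can diverge sharply on an imbalanced test set, which is one plausible reading of the gap between those columns in the second table.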