saattrupdan committed · Commit 1c199b9 · verified · 1 Parent(s): 5b9b8f9

Update README.md

Files changed (1)
  1. README.md +6 -2
README.md CHANGED
@@ -84,7 +84,8 @@ The Scandinavian scores are the average of the Danish, Swedish and Norwegian sco
 
  | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
  | :-------- | :------------ | :--------- | :----------- | :----------- |
- | `alexandrainst/scandi-nli-large-v2` (this) | **75.42%** | **75.41%** | **84.95%** | 354M |
+ | GPT-4o-mini | 68.88% | 68.57% | 76.89 | ? |
+ | `alexandrainst/scandi-nli-large-v2` (this) | 75.42% | 75.41% | 84.95% | 354M |
  | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 73.70% | 74.44% | 83.91% | 354M |
  | [`MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 69.01% | 71.99% | 80.66% | 279M |
  | [`alexandrainst/scandi-nli-base`](https://huggingface.co/alexandrainst/scandi-nli-base) | 67.42% | 71.54% | 80.09% | 178M |
@@ -102,7 +103,8 @@ The test split is generated using [this gist](https://gist.github.com/saattrupda
 
  | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
  | :-------- | :------------ | :--------- | :----------- | :----------- |
- | `alexandrainst/scandi-nli-large-v2` (this) | **75.65%** | **59.23%** | **87.89%** | 354M |
+ | GPT-4o-mini | **90.14%** | **63.70%** | **95.16%** | ? |
+ | `alexandrainst/scandi-nli-large-v2` (this) | 75.65% | 59.23% | 87.89% | 354M |
  | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 73.80% | 58.41% | 86.98% | 354M |
  | [`MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 68.37% | 57.10% | 83.25% | 279M |
  | [`alexandrainst/scandi-nli-base`](https://huggingface.co/alexandrainst/scandi-nli-base) | 62.44% | 55.00% | 80.42% | 178M |
@@ -120,6 +122,7 @@ We acknowledge that not evaluating on a gold standard dataset is not ideal, but
 
  | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
  | :-------- | :------------ | :--------- | :----------- | :----------- |
+ | GPT-4o-mini | 66.31% | 77.00% | 77.31% | ? |
  | `alexandrainst/scandi-nli-large-v2` (this) | **79.02%** | **85.99%** | **85.99%** | 354M |
  | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 76.69% | 84.47% | 84.38% | 354M |
  | [`joeddav/xlm-roberta-large-xnli`](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 75.35% | 83.42% | 83.55% | 560M |
@@ -138,6 +141,7 @@ We acknowledge that not evaluating on a gold standard dataset is not ideal, but
 
  | **Model** | **MCC** | **Macro-F1** | **Accuracy** | **Number of Parameters** |
  | :-------- | :------------ | :--------- | :----------- | :----------- |
+ | GPT-4o-mini | 50.19% | 65.00% | 65.46% | ? |
  | `alexandrainst/scandi-nli-large-v2` (this) | **71.59%** | **81.00%** | **80.96%** | 354M |
  | [`alexandrainst/scandi-nli-large`](https://huggingface.co/alexandrainst/scandi-nli-large) | 70.61% | 80.43% | 80.36% | 354M |
  | [`joeddav/xlm-roberta-large-xnli`](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 67.99% | 78.68% | 78.60% | 560M |
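
For readers comparing the MCC, Macro-F1 and Accuracy columns in the tables above, here is a minimal pure-Python sketch of how these three metrics can be computed from gold and predicted labels. It is not the evaluation script behind the README; the function names, the three NLI label strings and the toy data are assumptions made for illustration, and MCC uses the standard multiclass generalisation of the Matthews correlation coefficient.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient."""
    s = len(y_true)                                   # total examples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correct predictions
    t_counts = Counter(y_true)                        # gold count per class
    p_counts = Counter(y_pred)                        # predicted count per class
    cov = c * s - sum(t_counts[k] * p_counts[k] for k in t_counts)
    denom = sqrt(
        (s * s - sum(v * v for v in p_counts.values()))
        * (s * s - sum(v * v for v in t_counts.values()))
    )
    return cov / denom if denom else 0.0


# Toy data (hypothetical): two of six predictions are wrong.
y_true = ["entailment", "neutral", "contradiction"] * 2
y_pred = ["entailment", "neutral", "contradiction",
          "entailment", "contradiction", "neutral"]

print(f"MCC:      {mcc(y_true, y_pred):.2%}")       # 50.00%
print(f"Macro-F1: {macro_f1(y_true, y_pred):.2%}")  # 66.67%
print(f"Accuracy: {accuracy(y_true, y_pred):.2%}")  # 66.67%
```

Because macro-F1 weights every class equally while accuracy weights every example equally, the two can diverge sharply on imbalanced test sets, which is one plausible reading of the second table, where accuracy is high and macro-F1 is much lower for every model.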